SlideShare ist ein Scribd-Unternehmen logo
1 von 82
Apache Kylin
Balance between Space and Time
Debashis Saha | Luke Han
2015-06-09
http://kylin.io | @ApacheKylin
About us
§Debashis Saha (@debashis_saha)
- VP, eBay Cloud Services (Platform, Infrastructure, Data)
§Luke Han (@lukehq)
- Sr. Product Manager, Analytics Data Infrastructure
- Committer & PMC Member of Apache Kylin
Agenda
§About Apache Kylin
§Feature Highlights
§Tech Highlights
§Roadmap
§Q&A
http://kylin.io
What
kylin	
  	
  /	
  ˈkiːˈlɪn	
  /	
  麒麟
-­‐-­‐n.	
  (in	
  Chinese	
  art)	
  a	
  mythical	
  animal	
  of	
  composite	
  form	
  
http://kylin.io
What
kylin	
  	
  /	
  ˈkiːˈlɪn	
  /	
  麒麟
-­‐-­‐n.	
  (in	
  Chinese	
  art)	
  a	
  mythical	
  animal	
  of	
  composite	
  form	
  
http://kylin.io
@ApacheKylin
What
kylin	
  	
  /	
  ˈkiːˈlɪn	
  /	
  麒麟
-­‐-­‐n.	
  (in	
  Chinese	
  art)	
  a	
  mythical	
  animal	
  of	
  composite	
  form	
  
Extreme OLAP Engine for Big Data
Kylin is an open source Distributed Analytics Engine, contributed
by eBay Inc., provides SQL interface and multi-dimensional
analysis (OLAP) on Hadoop supporting extremely large datasets
http://kylin.io
@ApacheKylin
What
kylin	
  	
  /	
  ˈkiːˈlɪn	
  /	
  麒麟
-­‐-­‐n.	
  (in	
  Chinese	
  art)	
  a	
  mythical	
  animal	
  of	
  composite	
  form	
  
Extreme OLAP Engine for Big Data
Kylin is an open source Distributed Analytics Engine, contributed
by eBay Inc., provides SQL interface and multi-dimensional
analysis (OLAP) on Hadoop supporting extremely large datasets
http://kylin.io
• Open	
  Sourced	
  on	
  Oct	
  1st,	
  2014	
  
• Accepted	
  as	
  Apache	
  Incubator	
  Project	
  on	
  Nov	
  25th,	
  2014	
  
• http://kylin.io	
  (http://kylin.incubator.apache.org)
@ApacheKylin
§ External
§ 25+ contributors in community
§ Adoption:
§ On Production: Baidu Map
§ Evaluation: Huawei, Bloomberg Law, British Gas, JD.com
Microsoft, Tableau…
§ eBay Internal Cases
- 90% ile query < 5 seconds
Case Cube	
  Size Raw	
  Records
User	
  Session	
  Analysis 26	
  TB 28+	
  billion	
  rows
Traffic	
  Analysis 21	
  TB 20+	
  billion	
  rows
Behavior	
  Analysis 560	
  GB 1.2+	
  billion	
  rows
—from mailing list
Who are using Kylin?
http://kylin.io
Why
http://kylin.io
Why
http://kylin.io
Happiness
Latency
10s
Why
http://kylin.io
Happiness
Latency
10s
size
Why
http://kylin.io
Happiness
Latency
10s
size
Balance Between Space and Time
http://kylin.io
Balance Between Space and Time
http://kylin.io
time, item
time, item, location
time, item, location, supplier
time item location supplier
time, location
Time, supplier
item, location
item, supplier
location, supplier
time, item, supplier
time, location, supplier
item, location, supplier
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
Balance Between Space and Time
http://kylin.io
time, item
time, item, location
time, item, location, supplier
time item location supplier
time, location
Time, supplier
item, location
item, supplier
location, supplier
time, item, supplier
time, location, supplier
item, location, supplier
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
• Cuboid = one combination of dimensions
• Cube = all combination of dimensions
(all cuboids)
OLAP Cube
How
http://kylin.io
How
http://kylin.io
How
http://kylin.io
Kylin
How
http://kylin.io
Kylin
How
http://kylin.io
Map Reduce
Kylin
How
http://kylin.io
Map Reduce
Kylin
How
http://kylin.io
Map Reduce
Kylin
BI Tools,Web App…
ANSI SQL
Agenda
§About Apache Kylin
§Feature Highlights
§Tech Highlights
§Roadmap
§Q&A
http://kylin.io
Feature Highlights
• Extremely Fast OLAP Engine at scale
• ANSI SQL Interface on Hadoop
• Seamless Integration with BI Tools, like Tableau
• Interactive Query Capability
• MOLAP Cube
• Incremental Build of Cubes
• Approximate Query Capability for Distinct Count (HyperLogLog)
• Leverage HBase Coprocessor for query latency
• Job Management and Monitoring
• User friendly Web GUI for manage, build, monitor and query cubes
• Security capability to set ACL at Cube/Project Level
• Support LDAP Integration
http://kylin.io
Define Data Model
http://kylin.io
Define Data Model
http://kylin.io
Define Data Model
http://kylin.io
Define Data Model
http://kylin.io
Define Data Model
http://kylin.io
Define Data Model
http://kylin.io
Manage Jobs
http://kylin.io
Manage Jobs
http://kylin.io
Manage Jobs
http://kylin.io
Manage Jobs
http://kylin.io
Manage Jobs
http://kylin.io
Manage Jobs
http://kylin.io
Explore the Data
http://kylin.io
Explore the Data
http://kylin.io
Explore the Data
http://kylin.io
Explore the Data
http://kylin.io
Explore the Data
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Interactive with BI Tool - Tableau
http://kylin.io
Agenda
§About Apache Kylin
§Feature Highlights
§Tech Highlights
§Roadmap
§Q&A
http://kylin.io
http://kylin.io
Cube	
  Build	
  Engine	
  
(MapReduce…)
SQL
Low	
  	
  Latency	
  -­‐	
  SecondsMid	
  Latency	
  -­‐	
  Minutes
Routing
3rd	
  Party	
  App	
  
(Web	
  App,	
  Mobile…)
Metadata
SQL-­‐Based	
  Tool	
  
(BI	
  Tools:	
  Tableau…)
Query	
  Engine
Hadoop
Hive
REST	
  API JDBC/ODBC
➢Online	
  Analysis	
  Data	
  Flow	
  
➢Offline	
  Data	
  Flow	
  
➢Clients/Users	
  interactive	
  with	
  Kylin	
  
via	
  SQL	
  
➢OLAP	
  Cube	
  is	
  transparent	
  to	
  users
Star	
  Schema	
  Data Key	
  Value	
  Data
Data	
  
Cube
OLAP	
  
Cube	
  
(HBase)
SQL
REST	
  Server
Kylin Architecture Overview
http://kylin.io
Cube:	
  …	
  
Fact	
  Table:	
  …	
  
Dimensions:	
  …	
  
Measures:	
  …	
  
Storage(HBase):	
  …
Fact
Dim Dim
Dim
Source	
  
Star	
  Schema
row	
  A
row	
  B
row	
  C
Column	
  Family
Val	
  1
Val	
  2
Val	
  3
Row	
  Key Column
Target	
  	
  
HBase	
  Storage
Mapping	
  
Cube	
  Metadata
End	
  User Cube	
  Modeler Admin
Data Modeling
http://kylin.io
Cube Build Job Flow
http://kylin.io
How to Store Cube - HBase Schema
http://kylin.io
Kylin Query Engine - Explain Plan
http://kylin.io
SELECT	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  test_kylin_fact.lstg_format_name,	
  	
  
test_sites.site_name,	
  SUM(test_kylin_fact.price)	
  AS	
  GMV,	
  
COUNT(*)	
  AS	
  TRANS_CNT	
  
FROM	
  	
  test_kylin_fact	
  	
  
	
  	
  LEFT	
  JOIN	
  test_cal_dt	
  ON	
  test_kylin_fact.cal_dt	
  =	
  
test_cal_dt.cal_dt	
  
	
  	
  LEFT	
  JOIN	
  test_category	
  ON	
  test_kylin_fact.leaf_categ_id	
  =	
  
test_category.leaf_categ_id	
  	
  AND	
  test_kylin_fact.lstg_site_id	
  
=	
  test_category.site_id	
  
	
  	
  LEFT	
  JOIN	
  test_sites	
  ON	
  test_kylin_fact.lstg_site_id	
  =	
  
test_sites.site_id	
  
WHERE	
  test_kylin_fact.seller_id	
  =	
  123456OR	
  
test_kylin_fact.lstg_format_name	
  =	
  ’New'	
  
GROUP	
  BY	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  
test_kylin_fact.lstg_format_name,test_sites.site_name
Kylin Query Engine - Explain Plan
http://kylin.io
SELECT	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  test_kylin_fact.lstg_format_name,	
  	
  
test_sites.site_name,	
  SUM(test_kylin_fact.price)	
  AS	
  GMV,	
  
COUNT(*)	
  AS	
  TRANS_CNT	
  
FROM	
  	
  test_kylin_fact	
  	
  
	
  	
  LEFT	
  JOIN	
  test_cal_dt	
  ON	
  test_kylin_fact.cal_dt	
  =	
  
test_cal_dt.cal_dt	
  
	
  	
  LEFT	
  JOIN	
  test_category	
  ON	
  test_kylin_fact.leaf_categ_id	
  =	
  
test_category.leaf_categ_id	
  	
  AND	
  test_kylin_fact.lstg_site_id	
  
=	
  test_category.site_id	
  
	
  	
  LEFT	
  JOIN	
  test_sites	
  ON	
  test_kylin_fact.lstg_site_id	
  =	
  
test_sites.site_id	
  
WHERE	
  test_kylin_fact.seller_id	
  =	
  123456OR	
  
test_kylin_fact.lstg_format_name	
  =	
  ’New'	
  
GROUP	
  BY	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  
test_kylin_fact.lstg_format_name,test_sites.site_name
OLAPToEnumerableConverter	
  
	
  	
  OLAPProjectRel(WEEK_BEG_DT=[$0],	
  category_name=[$1],	
  
CATEG_LVL2_NAME=[$2],	
  CATEG_LVL3_NAME=[$3],	
  LSTG_FORMAT_NAME=[$4],	
  
SITE_NAME=[$5],	
  GMV=[CASE(=($7,	
  0),	
  null,	
  $6)],	
  TRANS_CNT=[$8])	
  
	
  	
  	
  	
  OLAPAggregateRel(group=[{0,	
  1,	
  2,	
  3,	
  4,	
  5}],	
  agg#0=[$SUM0($6)],	
  
agg#1=[COUNT($6)],	
  TRANS_CNT=[COUNT()])	
  
	
  	
  	
  	
  	
  	
  OLAPProjectRel(WEEK_BEG_DT=[$13],	
  category_name=[$21],	
  
CATEG_LVL2_NAME=[$15],	
  CATEG_LVL3_NAME=[$14],	
  
LSTG_FORMAT_NAME=[$5],	
  SITE_NAME=[$23],	
  PRICE=[$0])	
  
	
  	
  	
  	
  	
  	
  	
  	
  OLAPFilterRel(condition=[OR(=($3,	
  123456),	
  =($5,	
  ’New'))])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[=($2,	
  $25)],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[AND(=($6,	
  $22),	
  =($2,	
  $17))],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[=($4,	
  $12)],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_KYLIN_FACT]],	
  fields=[[0,	
  1,	
  2,	
  3,	
  
4,	
  5,	
  6,	
  7,	
  8,	
  9,	
  10,	
  11]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_CAL_DT]],	
  fields=[[0,	
  1]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  test_category]],	
  fields=[[0,	
  1,	
  2,	
  3,	
  4,	
  5,	
  
6,	
  7,	
  8]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_SITES]],	
  fields=[[0,	
  1,	
  2]])
Kylin Query Engine - Explain Plan
http://kylin.io
SELECT	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  test_kylin_fact.lstg_format_name,	
  	
  
test_sites.site_name,	
  SUM(test_kylin_fact.price)	
  AS	
  GMV,	
  
COUNT(*)	
  AS	
  TRANS_CNT	
  
FROM	
  	
  test_kylin_fact	
  	
  
	
  	
  LEFT	
  JOIN	
  test_cal_dt	
  ON	
  test_kylin_fact.cal_dt	
  =	
  
test_cal_dt.cal_dt	
  
	
  	
  LEFT	
  JOIN	
  test_category	
  ON	
  test_kylin_fact.leaf_categ_id	
  =	
  
test_category.leaf_categ_id	
  	
  AND	
  test_kylin_fact.lstg_site_id	
  
=	
  test_category.site_id	
  
	
  	
  LEFT	
  JOIN	
  test_sites	
  ON	
  test_kylin_fact.lstg_site_id	
  =	
  
test_sites.site_id	
  
WHERE	
  test_kylin_fact.seller_id	
  =	
  123456OR	
  
test_kylin_fact.lstg_format_name	
  =	
  ’New'	
  
GROUP	
  BY	
  test_cal_dt.week_beg_dt,	
  
test_category.category_name,	
  test_category.lvl2_name,	
  
test_category.lvl3_name,	
  
test_kylin_fact.lstg_format_name,test_sites.site_name
OLAPToEnumerableConverter	
  
	
  	
  OLAPProjectRel(WEEK_BEG_DT=[$0],	
  category_name=[$1],	
  
CATEG_LVL2_NAME=[$2],	
  CATEG_LVL3_NAME=[$3],	
  LSTG_FORMAT_NAME=[$4],	
  
SITE_NAME=[$5],	
  GMV=[CASE(=($7,	
  0),	
  null,	
  $6)],	
  TRANS_CNT=[$8])	
  
	
  	
  	
  	
  OLAPAggregateRel(group=[{0,	
  1,	
  2,	
  3,	
  4,	
  5}],	
  agg#0=[$SUM0($6)],	
  
agg#1=[COUNT($6)],	
  TRANS_CNT=[COUNT()])	
  
	
  	
  	
  	
  	
  	
  OLAPProjectRel(WEEK_BEG_DT=[$13],	
  category_name=[$21],	
  
CATEG_LVL2_NAME=[$15],	
  CATEG_LVL3_NAME=[$14],	
  
LSTG_FORMAT_NAME=[$5],	
  SITE_NAME=[$23],	
  PRICE=[$0])	
  
	
  	
  	
  	
  	
  	
  	
  	
  OLAPFilterRel(condition=[OR(=($3,	
  123456),	
  =($5,	
  ’New'))])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[=($2,	
  $25)],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[AND(=($6,	
  $22),	
  =($2,	
  $17))],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPJoinRel(condition=[=($4,	
  $12)],	
  joinType=[left])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_KYLIN_FACT]],	
  fields=[[0,	
  1,	
  2,	
  3,	
  
4,	
  5,	
  6,	
  7,	
  8,	
  9,	
  10,	
  11]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_CAL_DT]],	
  fields=[[0,	
  1]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  test_category]],	
  fields=[[0,	
  1,	
  2,	
  3,	
  4,	
  5,	
  
6,	
  7,	
  8]])	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  OLAPTableScan(table=[[DEFAULT,	
  TEST_SITES]],	
  fields=[[0,	
  1,	
  2]])
Kylin Query Engine - Explain Plan
Cube Optimization
§ Curse	
  of	
  Dimensionality	
  
§ N	
  dimension	
  cube	
  has	
  2N	
  cuboid	
  
§ Full	
  Cube	
  vs.	
  Parqal	
  Cube	
  	
  
§ Huge	
  Data	
  Volume	
  	
  
§ Dicqonary	
  Encoding	
  
§ Incremental	
  Building
http://kylin.io
§ Full	
  Cube
- Pre-aggregate all dimension combinations
- “Curse of dimensionality”: N dimension cube has 2N cuboid.
§ ParOal	
  Cube
- To avoid dimension explosion, we divide the dimensions into different aggregation
groups
- 2N+M+L à 2N + 2M + 2L
- For cube with 30 dimensions, if we divide these dimensions into 3 group, the cuboid
number will reduce from 1 Billion to 3 Thousands
- 230 à 210 + 210 + 210
- Tradeoff between online aggregation and offline pre-aggregation
http://kylin.io
Full Cube vs. Partical Cube
http://kylin.io
Partical Cube
http://kylin.io
Incremental Build
What’s Next
§ Improve	
  cube	
  algorithm	
  
§ Cube	
  by	
  segments,	
  30%-­‐50%	
  faster	
  
§ Build	
  delay	
  down	
  to	
  tens	
  of	
  minutes	
  
§ Streaming	
  cubing	
  
§ Analyze	
  real-­‐qme	
  data	
  
§ Build	
  delay	
  down	
  to	
  seconds	
  
§ Spark
http://kylin.io
Cube by Layer
§ The current algorithm
- Many MRs, the number of
dimensions
- Huge shuffles, aggregation at
reduce side, 100x of total cube
size
http://kylin.io
Full	
  Data
0-­‐D	
  Cuboid
1-­‐D	
  Cuboid
2-­‐D	
  Cuboid
3-­‐D	
  Cuboid
4-­‐D	
  Cuboid
MR
MR
MR
MR
MR
Cube by Segments
§ The to-be algorithm,
30%-50% faster
- 1 round MR
- Reduced shuffles, map side
aggregation, 20x total cube
size
- Hourly incremental build done
in tens of minutes
http://kylin.io
Data	
  Split
Cube	
  Segment
Data	
  Split
Cube	
  Segment
Data	
  Split
Cube	
  Segment
……
Final	
  Cube
Merge	
  Sort	
  
(Shuffle)
mapper mapper mapper
Streaming Cubing
§ Cube is great but…
- Cube takes time to build, how about real-time analysis?
- Sometimes we want to drill down to row level information
§ Streaming cubing
- Build micro cube segments from streaming
- Use inverted index to capture last minute data
http://kylin.io
http://kylin.io
Streaming
seconds	
  delay
Last	
  Hour
Inverted	
  
Index
Before	
  Last	
  Hour
Cube
Kylin Lambda Architecture
minutes	
  delay
Query	
  Engine
ANSI	
  SQL
Hybrid Storage
Interface
Adding Spark Support
http://kylin.io
Adding Spark Support
§ Cubing Efficiency
§ MR is not optimal framework
§ Spark Cubing Engine
http://kylin.io
Adding Spark Support
§ Cubing Efficiency
§ MR is not optimal framework
§ Spark Cubing Engine
§ Source from SparkSQL
§ Read data from SparkSQL instead of Hive
http://kylin.io
Adding Spark Support
§ Cubing Efficiency
§ MR is not optimal framework
§ Spark Cubing Engine
§ Source from SparkSQL
§ Read data from SparkSQL instead of Hive
§ Route to SparkSQL
§ Unsupported queries be coved by SparkSQL
http://kylin.io
Agenda
§About Apache Kylin
§Feature Highlights
§Tech Highlights
§Roadmap
§Q&A
http://kylin.io
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Sep,	
  2013
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
Sep,	
  2013
Jan,	
  2014
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
MOLAP	
  
• Incremental	
  Refresh	
  
• ANSI	
  SQL	
  
• ODBC	
  Driver	
  
• Web	
  GUI	
  
• Tableau	
  
• ACL	
  
• Open	
  SourceSep,	
  2013
Jan,	
  2014
Oct,	
  2014
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
MOLAP	
  
• Incremental	
  Refresh	
  
• ANSI	
  SQL	
  
• ODBC	
  Driver	
  
• Web	
  GUI	
  
• Tableau	
  
• ACL	
  
• Open	
  Source
StreamingOLAP	
  
• Streaming	
  OLAP	
  
• JDBC	
  Driver	
  
• New	
  UI	
  
• Excel	
  
• SparkSQL	
  
• …	
  more	
  
Sep,	
  2013
Jan,	
  2014
Oct,	
  2014
H1,	
  2015
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
MOLAP	
  
• Incremental	
  Refresh	
  
• ANSI	
  SQL	
  
• ODBC	
  Driver	
  
• Web	
  GUI	
  
• Tableau	
  
• ACL	
  
• Open	
  Source
StreamingOLAP	
  
• Streaming	
  OLAP	
  
• JDBC	
  Driver	
  
• New	
  UI	
  
• Excel	
  
• SparkSQL	
  
• …	
  more	
  
Sep,	
  2013
Jan,	
  2014
Oct,	
  2014
H1,	
  2015
HybridOLAP	
  
• Lambda	
  Arch	
  
• Automation	
  
• Capacity	
  
Management	
  
• Spark	
  	
  
• …	
  more
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
MOLAP	
  
• Incremental	
  Refresh	
  
• ANSI	
  SQL	
  
• ODBC	
  Driver	
  
• Web	
  GUI	
  
• Tableau	
  
• ACL	
  
• Open	
  Source
StreamingOLAP	
  
• Streaming	
  OLAP	
  
• JDBC	
  Driver	
  
• New	
  UI	
  
• Excel	
  
• SparkSQL	
  
• …	
  more	
  
Sep,	
  2013
Jan,	
  2014
Oct,	
  2014
H1,	
  2015
HybridOLAP	
  
• Lambda	
  Arch	
  
• Automation	
  
• Capacity	
  
Management	
  
• Spark	
  	
  
• …	
  more
Next	
  Gen	
  
• Adv	
  OLAP	
  Functions	
  
• In-­‐Memory	
  Analysis	
  
(TBD)	
  
• Mobile	
  (TBD)	
  
• …	
  more
Future2015 2016
Kylin Evolution Roadmap
http://kylin.io
20142013
Initial
Prototype	
  for	
  
MOLAP	
  
• Basic	
  end	
  to	
  end	
  POC	
  
MOLAP	
  
• Incremental	
  Refresh	
  
• ANSI	
  SQL	
  
• ODBC	
  Driver	
  
• Web	
  GUI	
  
• Tableau	
  
• ACL	
  
• Open	
  Source
StreamingOLAP	
  
• Streaming	
  OLAP	
  
• JDBC	
  Driver	
  
• New	
  UI	
  
• Excel	
  
• SparkSQL	
  
• …	
  more	
  
TBD
Sep,	
  2013
Jan,	
  2014
Oct,	
  2014
H1,	
  2015
HybridOLAP	
  
• Lambda	
  Arch	
  
• Automation	
  
• Capacity	
  
Management	
  
• Spark	
  	
  
• …	
  more
Next	
  Gen	
  
• Adv	
  OLAP	
  Functions	
  
• In-­‐Memory	
  Analysis	
  
(TBD)	
  
• Mobile	
  (TBD)	
  
• …	
  more
Kylin Ecosystem
■ Kylin Core
■ Fundamental framework of Kylin OLAP Engine
■ Extension
■ Plugins to support for additional functions and features
■ Integration
■ Lifecycle Management Support to integrate with other
applications
■ Interface
■ Allows for third party users to build more features via user-
interface atop Kylin core
http://kylin.io
Kylin OLAP
Core
Extension
à Security
à Redis Storage
à Spark Engine
à Docker
Interface
à Web Console
à Customized BI
à Ambari/Hue Plugin
Integration
à ODBC Driver
à ETL
à Drill
à SparkSQL
http://kylin.io
If	
  you	
  want	
  to	
  go	
  fast,	
  go	
  alone.	
  
If	
  you	
  want	
  to	
  go	
  far,	
  go	
  together.
-­‐-­‐African	
  Proverb
dev@kylin.incubator.apache.org

Weitere ähnliche Inhalte

Was ist angesagt?

Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan BlueDatabricks
 
Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Kevin Weil
 
Apache Kylin – Cubes on Hadoop
Apache Kylin – Cubes on HadoopApache Kylin – Cubes on Hadoop
Apache Kylin – Cubes on HadoopDataWorks Summit
 
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark ClustersFrom HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark ClustersDatabricks
 
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...HostedbyConfluent
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with RedisGeorge Platon
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)SANG WON PARK
 
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...DataStax Academy
 
BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementationSimon Su
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsGuozhang Wang
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...StreamNative
 
YARN: the Key to overcoming the challenges of broad-based Hadoop Adoption
YARN: the Key to overcoming the challenges of broad-based Hadoop AdoptionYARN: the Key to overcoming the challenges of broad-based Hadoop Adoption
YARN: the Key to overcoming the challenges of broad-based Hadoop AdoptionDataWorks Summit
 
Big Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylininovex GmbH
 
MySQL Server Settings Tuning
MySQL Server Settings TuningMySQL Server Settings Tuning
MySQL Server Settings Tuningguest5ca94b
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialDaniel Abadi
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesYoshinori Matsunobu
 
Shuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop TerasortShuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop Terasortpramodbiligiri
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 

Was ist angesagt? (20)

Spark and S3 with Ryan Blue
Spark and S3 with Ryan BlueSpark and S3 with Ryan Blue
Spark and S3 with Ryan Blue
 
Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)Rainbird: Realtime Analytics at Twitter (Strata 2011)
Rainbird: Realtime Analytics at Twitter (Strata 2011)
 
Apache Kylin – Cubes on Hadoop
Apache Kylin – Cubes on HadoopApache Kylin – Cubes on Hadoop
Apache Kylin – Cubes on Hadoop
 
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark ClustersFrom HDFS to S3: Migrate Pinterest Apache Spark Clusters
From HDFS to S3: Migrate Pinterest Apache Spark Clusters
 
TiDB Introduction
TiDB IntroductionTiDB Introduction
TiDB Introduction
 
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
Data in Motion: Building Stream-Based Architectures with Qlik Replicate & Kaf...
 
Caching solutions with Redis
Caching solutions   with RedisCaching solutions   with Redis
Caching solutions with Redis
 
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
OLAP for Big Data (Druid vs Apache Kylin vs Apache Lens)
 
Druid
DruidDruid
Druid
 
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
Cassandra Day NY 2014: Apache Cassandra & Python for the The New York Times ⨍...
 
BigQuery implementation
BigQuery implementationBigQuery implementation
BigQuery implementation
 
Exactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka StreamsExactly-once Stream Processing with Kafka Streams
Exactly-once Stream Processing with Kafka Streams
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
 
YARN: the Key to overcoming the challenges of broad-based Hadoop Adoption
YARN: the Key to overcoming the challenges of broad-based Hadoop AdoptionYARN: the Key to overcoming the challenges of broad-based Hadoop Adoption
YARN: the Key to overcoming the challenges of broad-based Hadoop Adoption
 
Big Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylin
 
MySQL Server Settings Tuning
MySQL Server Settings TuningMySQL Server Settings Tuning
MySQL Server Settings Tuning
 
SQL-on-Hadoop Tutorial
SQL-on-Hadoop TutorialSQL-on-Hadoop Tutorial
SQL-on-Hadoop Tutorial
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Shuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop TerasortShuffle phase as the bottleneck in Hadoop Terasort
Shuffle phase as the bottleneck in Hadoop Terasort
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 

Andere mochten auch

From Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllFrom Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllDataWorks Summit
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicDataWorks Summit
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesDataWorks Summit
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitDataWorks Summit
 
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run ApproachEvolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run ApproachDataWorks Summit
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopDataWorks Summit
 
June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2DataWorks Summit
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application ResourcesDataWorks Summit
 
large scale collaborative filtering using Apache Giraph
large scale collaborative filtering using Apache Giraphlarge scale collaborative filtering using Apache Giraph
large scale collaborative filtering using Apache GiraphDataWorks Summit
 
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...DataWorks Summit
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterDataWorks Summit
 
Scaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresScaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresDataWorks Summit
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionDataWorks Summit
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course WorkshopDataWorks Summit
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceDataWorks Summit
 
How to use Parquet as a Sasis for ETL and Analytics
How to use Parquet as a Sasis for ETL and AnalyticsHow to use Parquet as a Sasis for ETL and Analytics
How to use Parquet as a Sasis for ETL and AnalyticsDataWorks Summit
 
Apache Lens: Unified OLAP on Realtime and Historic Data
Apache Lens: Unified OLAP on Realtime and Historic DataApache Lens: Unified OLAP on Realtime and Historic Data
Apache Lens: Unified OLAP on Realtime and Historic DataDataWorks Summit
 
Internet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitInternet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitDataWorks Summit
 
Spark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitSpark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitDataWorks Summit
 
Applied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jApplied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jDataWorks Summit
 

Andere mochten auch (20)

From Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for AllFrom Beginners to Experts, Data Wrangling for All
From Beginners to Experts, Data Wrangling for All
 
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo ClinicBig Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
Big Data Platform Processes Daily Healthcare Data for Clinic Use at Mayo Clinic
 
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data PipelinesAirflow - An Open Source Platform to Author and Monitor Data Pipelines
Airflow - An Open Source Platform to Author and Monitor Data Pipelines
 
Hadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop SummitHadoop crash course workshop at Hadoop Summit
Hadoop crash course workshop at Hadoop Summit
 
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run ApproachEvolution of Big Data at Intel - Crawl, Walk and Run Approach
Evolution of Big Data at Intel - Crawl, Walk and Run Approach
 
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay HadoopHadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
Hadoop Eagle - Real Time Monitoring Framework for eBay Hadoop
 
June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2June 10 145pm hortonworks_tan & welch_v2
June 10 145pm hortonworks_tan & welch_v2
 
a Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resourcesa Secure Public Cache for YARN Application Resources
a Secure Public Cache for YARN Application Resources
 
large scale collaborative filtering using Apache Giraph
large scale collaborative filtering using Apache Giraphlarge scale collaborative filtering using Apache Giraph
large scale collaborative filtering using Apache Giraph
 
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...
Bigger, Faster, Easier: Building a Real-Time Self Service Data Analytics Ecos...
 
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at TwitterHadoop Performance Optimization at Scale, Lessons Learned at Twitter
Hadoop Performance Optimization at Scale, Lessons Learned at Twitter
 
Scaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value StoresScaling HDFS to Manage Billions of Files with Key-Value Stores
Scaling HDFS to Manage Billions of Files with Key-Value Stores
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data Ingestion
 
Internet of things Crash Course Workshop
Internet of things Crash Course WorkshopInternet of things Crash Course Workshop
Internet of things Crash Course Workshop
 
Improving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of ServiceImproving HDFS Availability with IPC Quality of Service
Improving HDFS Availability with IPC Quality of Service
 
How to use Parquet as a Sasis for ETL and Analytics
How to use Parquet as a Sasis for ETL and AnalyticsHow to use Parquet as a Sasis for ETL and Analytics
How to use Parquet as a Sasis for ETL and Analytics
 
Apache Lens: Unified OLAP on Realtime and Historic Data
Apache Lens: Unified OLAP on Realtime and Historic DataApache Lens: Unified OLAP on Realtime and Historic Data
Apache Lens: Unified OLAP on Realtime and Historic Data
 
Internet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop SummitInternet of Things Crash Course Workshop at Hadoop Summit
Internet of Things Crash Course Workshop at Hadoop Summit
 
Spark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop SummitSpark crash course workshop at Hadoop Summit
Spark crash course workshop at Hadoop Summit
 
Applied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4jApplied Deep Learning with Spark and Deeplearning4j
Applied Deep Learning with Spark and Deeplearning4j
 

Ähnlich wie Apache Kylin - Balance Between Space and Time

Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin IntroductionLuke Han
 
Apache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingApache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingLuke Han
 
Apache Kylin Streaming
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming hongbin ma
 
Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)qhzhou
 
Apache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecYang Li
 
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for HadoopHBaseCon
 
ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015Luke Han
 
Apache Kylin - Balance between space and time - Hadoop Summit 2015
Apache Kylin -  Balance between space and time - Hadoop Summit 2015Apache Kylin -  Balance between space and time - Hadoop Summit 2015
Apache Kylin - Balance between space and time - Hadoop Summit 2015Debashis Saha
 
Kylin OLAP Engine Tour
Kylin OLAP Engine TourKylin OLAP Engine Tour
Kylin OLAP Engine TourLuke Han
 
Apache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataApache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataLuke Han
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...Luke Han
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanLuke Han
 
Apache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and JapanApache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and JapanLuke Han
 
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveApache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveXu Jiang
 
Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Seshu Adunuthula
 
Apache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetssuser931288
 
Apache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetApache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetChun'en Ni
 
Apache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouseApache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouseYang Li
 
Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20Qiming Teng
 
Leaving the Ivory Tower: Research in the Real World
Leaving the Ivory Tower: Research in the Real WorldLeaving the Ivory Tower: Research in the Real World
Leaving the Ivory Tower: Research in the Real WorldArmonDadgar
 

Ähnlich wie Apache Kylin - Balance Between Space and Time (20)

Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
 
Apache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingApache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 Beijing
 
Apache Kylin Streaming
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming
 
Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)
 
Apache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 Dec
 
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
 
ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015
 
Apache Kylin - Balance between space and time - Hadoop Summit 2015
Apache Kylin -  Balance between space and time - Hadoop Summit 2015Apache Kylin -  Balance between space and time - Hadoop Summit 2015
Apache Kylin - Balance between space and time - Hadoop Summit 2015
 
Kylin OLAP Engine Tour
Kylin OLAP Engine TourKylin OLAP Engine Tour
Kylin OLAP Engine Tour
 
Apache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataApache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big Data
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
 
The Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke HanThe Evolution of Apache Kylin by Luke Han
The Evolution of Apache Kylin by Luke Han
 
Apache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and JapanApache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and Japan
 
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep DiveApache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
Apache Kylin: OLAP Engine on Hadoop - Tech Deep Dive
 
Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015
 
Apache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large dataset
 
Apache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetApache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large dataset
 
Apache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouseApache kylin 2.0: from classic olap to real-time data warehouse
Apache kylin 2.0: from classic olap to real-time data warehouse
 
Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20Senlin deep dive 2015 05-20
Senlin deep dive 2015 05-20
 
Leaving the Ivory Tower: Research in the Real World
Leaving the Ivory Tower: Research in the Real WorldLeaving the Ivory Tower: Research in the Real World
Leaving the Ivory Tower: Research in the Real World
 

Mehr von DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Mehr von DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Kürzlich hochgeladen

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Kürzlich hochgeladen (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Apache Kylin - Balance Between Space and Time

  • 1. Apache Kylin Balance between Space and Time Debashis Saha | Luke Han 2015-06-09 http://kylin.io | @ApacheKylin
  • 2. About us §Debashis Saha (@debashis_saha) - VP, eBay Cloud Services (Platform, Infrastructure, Data) §Luke Han (@lukehq) - Sr. Product Manager, Analytics Data Infrastructure - Committer & PMC Member of Apache Kylin
  • 3. Agenda §About Apache Kylin §Feature Highlights §Tech Highlights §Roadmap §Q&A http://kylin.io
  • 4. What kylin    /  ˈkiːˈlɪn  /  麒麟 -­‐-­‐n.  (in  Chinese  art)  a  mythical  animal  of  composite  form   http://kylin.io
  • 5. What kylin    /  ˈkiːˈlɪn  /  麒麟 -­‐-­‐n.  (in  Chinese  art)  a  mythical  animal  of  composite  form   http://kylin.io @ApacheKylin
  • 6. What kylin    /  ˈkiːˈlɪn  /  麒麟 -­‐-­‐n.  (in  Chinese  art)  a  mythical  animal  of  composite  form   Extreme OLAP Engine for Big Data Kylin is an open source Distributed Analytics Engine, contributed by eBay Inc., provides SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets http://kylin.io @ApacheKylin
  • 7. What kylin    /  ˈkiːˈlɪn  /  麒麟 -­‐-­‐n.  (in  Chinese  art)  a  mythical  animal  of  composite  form   Extreme OLAP Engine for Big Data Kylin is an open source Distributed Analytics Engine, contributed by eBay Inc., provides SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets http://kylin.io • Open  Sourced  on  Oct  1st,  2014   • Accepted  as  Apache  Incubator  Project  on  Nov  25th,  2014   • http://kylin.io  (http://kylin.incubator.apache.org) @ApacheKylin
  • 8. § External § 25+ contributors in community § Adoption: § On Production: Baidu Map § Evaluation: Huawei, Bloomberg Law, British Gas, JD.com Microsoft, Tableau… § eBay Internal Cases - 90% ile query < 5 seconds Case Cube  Size Raw  Records User  Session  Analysis 26  TB 28+  billion  rows Traffic  Analysis 21  TB 20+  billion  rows Behavior  Analysis 560  GB 1.2+  billion  rows —from mailing list Who are using Kylin? http://kylin.io
  • 13. Balance Between Space and Time http://kylin.io
  • 14. Balance Between Space and Time http://kylin.io time, item time, item, location time, item, location, supplier time item location supplier time, location Time, supplier item, location item, supplier location, supplier time, item, supplier time, location, supplier item, location, supplier 0-D(apex) cuboid 1-D cuboids 2-D cuboids 3-D cuboids 4-D(base) cuboid • Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells 1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier> 2. (9/15, milk, Urbana, *) - <time, item, location> 3. (*, milk, Urbana, *) - <item, location> 4. (*, milk, Chicago, *) - <item, location> 5. (*, milk, *, *) - <item>
  • 15. Balance Between Space and Time http://kylin.io time, item time, item, location time, item, location, supplier time item location supplier time, location Time, supplier item, location item, supplier location, supplier time, item, supplier time, location, supplier item, location, supplier 0-D(apex) cuboid 1-D cuboids 2-D cuboids 3-D cuboids 4-D(base) cuboid • Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells 1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier> 2. (9/15, milk, Urbana, *) - <time, item, location> 3. (*, milk, Urbana, *) - <item, location> 4. (*, milk, Chicago, *) - <item, location> 5. (*, milk, *, *) - <item> • Cuboid = one combination of dimensions • Cube = all combination of dimensions (all cuboids) OLAP Cube
  • 23. Agenda §About Apache Kylin §Feature Highlights §Tech Highlights §Roadmap §Q&A http://kylin.io
  • 24. Feature Highlights • Extremely Fast OLAP Engine at scale • ANSI SQL Interface on Hadoop • Seamless Integration with BI Tools, like Tableau • Interactive Query Capability • MOLAP Cube • Incremental Build of Cubes • Approximate Query Capability for Distinct Count (HyperLogLog) • Leverage HBase Coprocessor for query latency • Job Management and Monitoring • User friendly Web GUI for manage, build, monitor and query cubes • Security capability to set ACL at Cube/Project Level • Support LDAP Integration http://kylin.io
  • 42. Interactive with BI Tool - Tableau http://kylin.io
  • 43. Interactive with BI Tool - Tableau http://kylin.io
  • 44. Interactive with BI Tool - Tableau http://kylin.io
  • 45. Interactive with BI Tool - Tableau http://kylin.io
  • 46. Interactive with BI Tool - Tableau http://kylin.io
  • 47. Interactive with BI Tool - Tableau http://kylin.io
  • 48. Interactive with BI Tool - Tableau http://kylin.io
  • 49. Agenda §About Apache Kylin §Feature Highlights §Tech Highlights §Roadmap §Q&A http://kylin.io
  • 50. http://kylin.io Cube  Build  Engine   (MapReduce…) SQL Low    Latency  -­‐  SecondsMid  Latency  -­‐  Minutes Routing 3rd  Party  App   (Web  App,  Mobile…) Metadata SQL-­‐Based  Tool   (BI  Tools:  Tableau…) Query  Engine Hadoop Hive REST  API JDBC/ODBC ➢Online  Analysis  Data  Flow   ➢Offline  Data  Flow   ➢Clients/Users  interactive  with  Kylin   via  SQL   ➢OLAP  Cube  is  transparent  to  users Star  Schema  Data Key  Value  Data Data   Cube OLAP   Cube   (HBase) SQL REST  Server Kylin Architecture Overview
  • 51. http://kylin.io Cube:  …   Fact  Table:  …   Dimensions:  …   Measures:  …   Storage(HBase):  … Fact Dim Dim Dim Source   Star  Schema row  A row  B row  C Column  Family Val  1 Val  2 Val  3 Row  Key Column Target     HBase  Storage Mapping   Cube  Metadata End  User Cube  Modeler Admin Data Modeling
  • 53. http://kylin.io How to Store Cube - HBase Schema
  • 55. http://kylin.io SELECT  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,  test_kylin_fact.lstg_format_name,     test_sites.site_name,  SUM(test_kylin_fact.price)  AS  GMV,   COUNT(*)  AS  TRANS_CNT   FROM    test_kylin_fact        LEFT  JOIN  test_cal_dt  ON  test_kylin_fact.cal_dt  =   test_cal_dt.cal_dt      LEFT  JOIN  test_category  ON  test_kylin_fact.leaf_categ_id  =   test_category.leaf_categ_id    AND  test_kylin_fact.lstg_site_id   =  test_category.site_id      LEFT  JOIN  test_sites  ON  test_kylin_fact.lstg_site_id  =   test_sites.site_id   WHERE  test_kylin_fact.seller_id  =  123456OR   test_kylin_fact.lstg_format_name  =  ’New'   GROUP  BY  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,   test_kylin_fact.lstg_format_name,test_sites.site_name Kylin Query Engine - Explain Plan
  • 56. http://kylin.io SELECT  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,  test_kylin_fact.lstg_format_name,     test_sites.site_name,  SUM(test_kylin_fact.price)  AS  GMV,   COUNT(*)  AS  TRANS_CNT   FROM    test_kylin_fact        LEFT  JOIN  test_cal_dt  ON  test_kylin_fact.cal_dt  =   test_cal_dt.cal_dt      LEFT  JOIN  test_category  ON  test_kylin_fact.leaf_categ_id  =   test_category.leaf_categ_id    AND  test_kylin_fact.lstg_site_id   =  test_category.site_id      LEFT  JOIN  test_sites  ON  test_kylin_fact.lstg_site_id  =   test_sites.site_id   WHERE  test_kylin_fact.seller_id  =  123456OR   test_kylin_fact.lstg_format_name  =  ’New'   GROUP  BY  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,   test_kylin_fact.lstg_format_name,test_sites.site_name OLAPToEnumerableConverter      OLAPProjectRel(WEEK_BEG_DT=[$0],  category_name=[$1],   CATEG_LVL2_NAME=[$2],  CATEG_LVL3_NAME=[$3],  LSTG_FORMAT_NAME=[$4],   SITE_NAME=[$5],  GMV=[CASE(=($7,  0),  null,  $6)],  TRANS_CNT=[$8])          OLAPAggregateRel(group=[{0,  1,  2,  3,  4,  5}],  agg#0=[$SUM0($6)],   agg#1=[COUNT($6)],  TRANS_CNT=[COUNT()])              OLAPProjectRel(WEEK_BEG_DT=[$13],  category_name=[$21],   CATEG_LVL2_NAME=[$15],  CATEG_LVL3_NAME=[$14],   LSTG_FORMAT_NAME=[$5],  SITE_NAME=[$23],  PRICE=[$0])                  OLAPFilterRel(condition=[OR(=($3,  123456),  =($5,  ’New'))])                      OLAPJoinRel(condition=[=($2,  $25)],  joinType=[left])                          OLAPJoinRel(condition=[AND(=($6,  $22),  =($2,  $17))],  joinType=[left])                              OLAPJoinRel(condition=[=($4,  $12)],  joinType=[left])                                  OLAPTableScan(table=[[DEFAULT,  TEST_KYLIN_FACT]],  fields=[[0,  1,  2,  3,   4,  5,  6,  7,  8,  9,  10,  11]])                                  OLAPTableScan(table=[[DEFAULT,  TEST_CAL_DT]],  fields=[[0,  1]])                              OLAPTableScan(table=[[DEFAULT,  test_category]],  fields=[[0,  1,  2,  3,  4,  5,   6,  7,  8]])                          OLAPTableScan(table=[[DEFAULT,  TEST_SITES]],  fields=[[0,  1,  2]]) Kylin Query Engine - Explain Plan
  • 57. http://kylin.io SELECT  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,  test_kylin_fact.lstg_format_name,     test_sites.site_name,  SUM(test_kylin_fact.price)  AS  GMV,   COUNT(*)  AS  TRANS_CNT   FROM    test_kylin_fact        LEFT  JOIN  test_cal_dt  ON  test_kylin_fact.cal_dt  =   test_cal_dt.cal_dt      LEFT  JOIN  test_category  ON  test_kylin_fact.leaf_categ_id  =   test_category.leaf_categ_id    AND  test_kylin_fact.lstg_site_id   =  test_category.site_id      LEFT  JOIN  test_sites  ON  test_kylin_fact.lstg_site_id  =   test_sites.site_id   WHERE  test_kylin_fact.seller_id  =  123456OR   test_kylin_fact.lstg_format_name  =  ’New'   GROUP  BY  test_cal_dt.week_beg_dt,   test_category.category_name,  test_category.lvl2_name,   test_category.lvl3_name,   test_kylin_fact.lstg_format_name,test_sites.site_name OLAPToEnumerableConverter      OLAPProjectRel(WEEK_BEG_DT=[$0],  category_name=[$1],   CATEG_LVL2_NAME=[$2],  CATEG_LVL3_NAME=[$3],  LSTG_FORMAT_NAME=[$4],   SITE_NAME=[$5],  GMV=[CASE(=($7,  0),  null,  $6)],  TRANS_CNT=[$8])          OLAPAggregateRel(group=[{0,  1,  2,  3,  4,  5}],  agg#0=[$SUM0($6)],   agg#1=[COUNT($6)],  TRANS_CNT=[COUNT()])              OLAPProjectRel(WEEK_BEG_DT=[$13],  category_name=[$21],   CATEG_LVL2_NAME=[$15],  CATEG_LVL3_NAME=[$14],   LSTG_FORMAT_NAME=[$5],  SITE_NAME=[$23],  PRICE=[$0])                  OLAPFilterRel(condition=[OR(=($3,  123456),  =($5,  ’New'))])                      OLAPJoinRel(condition=[=($2,  $25)],  joinType=[left])                          OLAPJoinRel(condition=[AND(=($6,  $22),  =($2,  $17))],  joinType=[left])                              OLAPJoinRel(condition=[=($4,  $12)],  joinType=[left])                                  OLAPTableScan(table=[[DEFAULT,  TEST_KYLIN_FACT]],  fields=[[0,  1,  2,  3,   4,  5,  6,  7,  8,  9,  10,  11]])                                  OLAPTableScan(table=[[DEFAULT,  TEST_CAL_DT]],  fields=[[0,  1]])                              OLAPTableScan(table=[[DEFAULT,  test_category]],  fields=[[0,  1,  2,  3,  4,  5,   6,  7,  8]])                          OLAPTableScan(table=[[DEFAULT,  TEST_SITES]],  fields=[[0,  1,  2]]) Kylin Query Engine - Explain Plan
  • 58. Cube Optimization § Curse  of  Dimensionality   § N  dimension  cube  has  2N  cuboid   § Full  Cube  vs.  Parqal  Cube     § Huge  Data  Volume     § Dicqonary  Encoding   § Incremental  Building http://kylin.io
  • 59. § Full  Cube - Pre-aggregate all dimension combinations - “Curse of dimensionality”: N dimension cube has 2N cuboid. § ParOal  Cube - To avoid dimension explosion, we divide the dimensions into different aggregation groups - 2N+M+L à 2N + 2M + 2L - For cube with 30 dimensions, if we divide these dimensions into 3 group, the cuboid number will reduce from 1 Billion to 3 Thousands - 230 à 210 + 210 + 210 - Tradeoff between online aggregation and offline pre-aggregation http://kylin.io Full Cube vs. Partical Cube
  • 62. What’s Next § Improve  cube  algorithm   § Cube  by  segments,  30%-­‐50%  faster   § Build  delay  down  to  tens  of  minutes   § Streaming  cubing   § Analyze  real-­‐qme  data   § Build  delay  down  to  seconds   § Spark http://kylin.io
  • 63. Cube by Layer § The current algorithm - Many MRs, the number of dimensions - Huge shuffles, aggregation at reduce side, 100x of total cube size http://kylin.io Full  Data 0-­‐D  Cuboid 1-­‐D  Cuboid 2-­‐D  Cuboid 3-­‐D  Cuboid 4-­‐D  Cuboid MR MR MR MR MR
  • 64. Cube by Segments § The to-be algorithm, 30%-50% faster - 1 round MR - Reduced shuffles, map side aggregation, 20x total cube size - Hourly incremental build done in tens of minutes http://kylin.io Data  Split Cube  Segment Data  Split Cube  Segment Data  Split Cube  Segment …… Final  Cube Merge  Sort   (Shuffle) mapper mapper mapper
  • 65. Streaming Cubing § Cube is great but… - Cube takes time to build, how about real-time analysis? - Sometimes we want to drill down to row level information § Streaming cubing - Build micro cube segments from streaming - Use inverted index to capture last minute data http://kylin.io
  • 66. http://kylin.io Streaming seconds  delay Last  Hour Inverted   Index Before  Last  Hour Cube Kylin Lambda Architecture minutes  delay Query  Engine ANSI  SQL Hybrid Storage Interface
  • 68. Adding Spark Support § Cubing Efficiency § MR is not optimal framework § Spark Cubing Engine http://kylin.io
  • 69. Adding Spark Support § Cubing Efficiency § MR is not optimal framework § Spark Cubing Engine § Source from SparkSQL § Read data from SparkSQL instead of Hive http://kylin.io
  • 70. Adding Spark Support § Cubing Efficiency § MR is not optimal framework § Spark Cubing Engine § Source from SparkSQL § Read data from SparkSQL instead of Hive § Route to SparkSQL § Unsupported queries be coved by SparkSQL http://kylin.io
  • 71. Agenda §About Apache Kylin §Feature Highlights §Tech Highlights §Roadmap §Q&A http://kylin.io
  • 72. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013
  • 73. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013
  • 74. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Sep,  2013
  • 75. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   Sep,  2013 Jan,  2014
  • 76. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   MOLAP   • Incremental  Refresh   • ANSI  SQL   • ODBC  Driver   • Web  GUI   • Tableau   • ACL   • Open  SourceSep,  2013 Jan,  2014 Oct,  2014
  • 77. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   MOLAP   • Incremental  Refresh   • ANSI  SQL   • ODBC  Driver   • Web  GUI   • Tableau   • ACL   • Open  Source StreamingOLAP   • Streaming  OLAP   • JDBC  Driver   • New  UI   • Excel   • SparkSQL   • …  more   Sep,  2013 Jan,  2014 Oct,  2014 H1,  2015
  • 78. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   MOLAP   • Incremental  Refresh   • ANSI  SQL   • ODBC  Driver   • Web  GUI   • Tableau   • ACL   • Open  Source StreamingOLAP   • Streaming  OLAP   • JDBC  Driver   • New  UI   • Excel   • SparkSQL   • …  more   Sep,  2013 Jan,  2014 Oct,  2014 H1,  2015 HybridOLAP   • Lambda  Arch   • Automation   • Capacity   Management   • Spark     • …  more
  • 79. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   MOLAP   • Incremental  Refresh   • ANSI  SQL   • ODBC  Driver   • Web  GUI   • Tableau   • ACL   • Open  Source StreamingOLAP   • Streaming  OLAP   • JDBC  Driver   • New  UI   • Excel   • SparkSQL   • …  more   Sep,  2013 Jan,  2014 Oct,  2014 H1,  2015 HybridOLAP   • Lambda  Arch   • Automation   • Capacity   Management   • Spark     • …  more Next  Gen   • Adv  OLAP  Functions   • In-­‐Memory  Analysis   (TBD)   • Mobile  (TBD)   • …  more
  • 80. Future2015 2016 Kylin Evolution Roadmap http://kylin.io 20142013 Initial Prototype  for   MOLAP   • Basic  end  to  end  POC   MOLAP   • Incremental  Refresh   • ANSI  SQL   • ODBC  Driver   • Web  GUI   • Tableau   • ACL   • Open  Source StreamingOLAP   • Streaming  OLAP   • JDBC  Driver   • New  UI   • Excel   • SparkSQL   • …  more   TBD Sep,  2013 Jan,  2014 Oct,  2014 H1,  2015 HybridOLAP   • Lambda  Arch   • Automation   • Capacity   Management   • Spark     • …  more Next  Gen   • Adv  OLAP  Functions   • In-­‐Memory  Analysis   (TBD)   • Mobile  (TBD)   • …  more
  • 81. Kylin Ecosystem ■ Kylin Core ■ Fundamental framework of Kylin OLAP Engine ■ Extension ■ Plugins to support for additional functions and features ■ Integration ■ Lifecycle Management Support to integrate with other applications ■ Interface ■ Allows for third party users to build more features via user- interface atop Kylin core http://kylin.io Kylin OLAP Core Extension à Security à Redis Storage à Spark Engine à Docker Interface à Web Console à Customized BI à Ambari/Hue Plugin Integration à ODBC Driver à ETL à Drill à SparkSQL
  • 82. http://kylin.io If  you  want  to  go  fast,  go  alone.   If  you  want  to  go  far,  go  together. -­‐-­‐African  Proverb dev@kylin.incubator.apache.org