Pinot: Realtime OLAP for 530 Million Users
Seunghyun Lee
Software Engineer
Today’s agenda
1. Motivation
2. Architecture Overview
3. Scaling Pinot
4. Q&A
Analytics Use Case: Interactive Dashboard
select sum(pageView), time from T
where country = 'us' and browser = 'chrome' …
group by time
Slice and dice over arbitrary dimensions
Human-driven queries
Use Case               Response Latency           Query Rate  Possible Solutions
Interactive dashboard  sub-second to few seconds  ~1 qps      Columnar Store
Analytics Use Case: Site Facing
select sum(pageView) from T
where memberId = 456 and
pageKey = 'profilePage' and
privacySettings in (…)
group by time, [title|geo|industry]
Pre-defined query format with different primary key values
Use Case     Response Latency         Query Rate  Possible Solutions
Site facing  100ms (99th percentile)  1000s qps   KV Store
Analytics Use Case: Anomaly Detection
for d1 in [us, ca, …]
  for d2 in [chrome, ie, …]
    …
      select sum(pageView), time from T
      where country = d1 and browser = d2
      group by time
Identifying all issues requires us to monitor all possible combinations
Periodic machine-generated queries (bursty)
Use Case           Response Latency           Query Rate   Possible Solutions
Anomaly detection  sub-second to few seconds  10-100s qps  Streaming Engine
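The nested loops above can be sketched in Python as follows. This is only an illustration of how quickly the number of machine-generated queries grows; the function name `anomaly_queries` and the sample dimension values are assumptions, not part of the talk.

```python
from itertools import product

def anomaly_queries(dimensions, table="T", metric="pageView"):
    """Yield one query string per combination of dimension values.

    `dimensions` maps a column name to the list of values to monitor.
    (Hypothetical helper, for illustration only.)
    """
    names = list(dimensions)
    for values in product(*(dimensions[n] for n in names)):
        where = " and ".join(f"{n} = '{v}'" for n, v in zip(names, values))
        yield (f"select sum({metric}), time from {table} "
               f"where {where} group by time")

# Two dimensions with two values each already produce four queries.
dims = {"country": ["us", "ca"], "browser": ["chrome", "ie"]}
queries = list(anomaly_queries(dims))
for q in queries:
    print(q)
```

The query count is the product of the value counts of all monitored dimensions, which is why this workload is bursty.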
Use Case               Response Latency           Query Rate   Possible Solutions
Interactive dashboard  sub-second to few seconds  ~1 qps       Columnar Store
Site facing            100ms (99th percentile)    1000s qps    KV Store (pre-cube)
Anomaly detection      sub-second to few seconds  10-100s qps  Streaming Engine
Same input data (Pageview)
Same OLAP-style query
What makes these use cases use different solutions?
Different solutions based on different workload characteristics
Can we support all these use cases in one single system?
What is Pinot?
SQL-like interface with predictable latency (no joins)
Batch Data Ingestion (Hadoop)
Realtime Data Ingestion (Kafka)
Distributed, horizontally scalable
Open source! (https://github.com/linkedin/pinot)
Pinot @ LinkedIn
• 50+ site-facing use cases
• 60k+ queries per second
• 2000+ tables
• 1.4M+ records ingested per second
• 300B documents per data center
• 2 trillion documents for internal use case
Architecture Overview
• Controller - handles cluster-wide coordination using Apache Helix and ZooKeeper
• Broker - handles query fan-out and query routing to servers
• Server - responds to query requests originating from the brokers
Query Execution: Distributed
Components: Broker, Controller (Helix), Servers (S1, S2, S3)
1. Query
2. Fetch routing table from Helix
3. Scatter request
4. Process request & send response (Server)
5. Gather response
6. Return response
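The scatter and gather steps above can be sketched as a toy Python program. It is a minimal sketch under stated assumptions: the in-memory `SERVERS` map, the function names, and the use of a thread pool are all hypothetical stand-ins and do not reflect Pinot's actual broker code.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "servers": server name -> {segment name -> metric values}.
SERVERS = {
    "S1": {"seg_a": [1, 2, 3]},
    "S2": {"seg_b": [10, 20]},
    "S3": {"seg_c": [100]},
}

def fetch_routing_table():
    """Step 2: stand-in for the routing table the broker obtains from Helix."""
    return {server: list(segments) for server, segments in SERVERS.items()}

def server_process(server, segments):
    """Step 4: a server computes a partial sum over its assigned segments."""
    return sum(sum(SERVERS[server][seg]) for seg in segments)

def broker_sum():
    """Steps 2 to 6 for a `select sum(m) from T` style query."""
    routing = fetch_routing_table()                          # step 2
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(server_process, s, segs)      # step 3: scatter
                   for s, segs in routing.items()]
        partials = [f.result() for f in futures]             # step 5: gather
    return sum(partials)                                     # step 6: merged result

print(broker_sum())
```

Because the broker waits for every partial result, the slowest server in the fan-out determines the response time.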
Query Execution: Hybrid Querying
realtime server: data for t = 1, 2, 3, 4, 5
offline server: data for t = 1-2 (loaded by the offline Hadoop job)
Broker: time boundary = 2
Incoming query:
select sum(m) from T
Query sent to the offline server:
select sum(m) from T
where t <= 2
Query sent to the realtime server:
select sum(m) from T
where t > 2
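The time-boundary rewrite can be sketched as below. This is a simplified illustration: `split_hybrid_query` is a hypothetical helper that only handles queries without trailing clauses such as group by, and `hybrid_sum` simulates both servers with in-memory (t, m) rows.

```python
def split_hybrid_query(query, time_column, time_boundary):
    """Return (offline_query, realtime_query) for a hybrid table.

    Assumption: `query` has no group by / order by clause, so the time
    predicate can simply be appended.
    """
    joiner = " and " if " where " in query.lower() else " where "
    offline = f"{query}{joiner}{time_column} <= {time_boundary}"
    realtime = f"{query}{joiner}{time_column} > {time_boundary}"
    return offline, realtime

def hybrid_sum(offline_rows, realtime_rows, time_boundary):
    """Sum m over (t, m) rows so that overlapping time ranges are counted once."""
    offline_part = sum(m for t, m in offline_rows if t <= time_boundary)
    realtime_part = sum(m for t, m in realtime_rows if t > time_boundary)
    return offline_part + realtime_part

offline_q, realtime_q = split_hybrid_query("select sum(m) from T", "t", 2)
print(offline_q)
print(realtime_q)

# Both servers hold t = 1 and t = 2; the boundary prevents double counting.
offline_rows = [(1, 10), (2, 20)]
realtime_rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]
print(hybrid_sum(offline_rows, realtime_rows, time_boundary=2))
```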
Query Execution: Single Node
Query Optimization
select max(col) from T
  -> Use metadata instead of scanning
select sum(metric) from T
where country = 'us' and accountId = x
  -> Reorders filter for better performance (apply accountId before country predicate)
Dynamic query planning based on column metadata, index, and dictionary
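Both optimizations can be illustrated with a small Python sketch. The `ColumnMeta` class and the cardinality-based ordering are assumptions chosen for illustration (an equality predicate on a higher-cardinality column is assumed to match fewer rows); the slides do not describe the planner's actual heuristics.

```python
from dataclasses import dataclass

@dataclass
class ColumnMeta:
    """Hypothetical per-column segment metadata."""
    min_value: object
    max_value: object
    cardinality: int  # number of distinct values in the column

def max_from_metadata(meta):
    """Answer `select max(col) from T` without scanning any rows."""
    return meta.max_value

def reorder_filters(predicates, metadata):
    """Apply equality predicates on high-cardinality columns first.

    `predicates` is a list of (column, value) pairs.
    """
    return sorted(predicates,
                  key=lambda p: metadata[p[0]].cardinality,
                  reverse=True)

metadata = {
    "country": ColumnMeta("al", "us", cardinality=200),
    "accountId": ColumnMeta(1, 999_999, cardinality=1_000_000),
}
plan = reorder_filters([("country", "us"), ("accountId", "x")], metadata)
print(plan)
print(max_from_metadata(metadata["accountId"]))
```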
Anatomy of Pinot Segment
Metadata: start/end time, available indexes, partitioning info, min/max value, …
Indexes: Inverted, Sorted, Startree
Raw Data
docId  country  code
0      us       002
1      ca       001
2      jp       003
…      …        …
Dictionary (dictId -> value)
dictId  country  code
0       ca       001
1       jp       002
2       us       003
…       …        …
Forward Index (docId -> dictId)
docId  country  code
0      2        1
1      0        0
2      1        2
…      …        …
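The dictionary and forward index from this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming a sorted dictionary and list-based indexes; `build_column` is a hypothetical helper and the real segment format is more compact.

```python
def build_column(values):
    """Dictionary-encode one column.

    Returns (dictionary, forward_index, inverted_index):
      dictionary     sorted distinct values; position = dictId
      forward_index  docId -> dictId
      inverted_index dictId -> list of docIds
    """
    dictionary = sorted(set(values))
    dict_id = {value: i for i, value in enumerate(dictionary)}
    forward_index = [dict_id[value] for value in values]
    inverted_index = {}
    for doc_id, d in enumerate(forward_index):
        inverted_index.setdefault(d, []).append(doc_id)
    return dictionary, forward_index, inverted_index

# The three raw records from the slide.
country_dict, country_fwd, country_inv = build_column(["us", "ca", "jp"])
code_dict, code_fwd, code_inv = build_column(["002", "001", "003"])

print(country_dict, country_fwd)
print(code_dict, code_fwd)
```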
Recap: Analytics Use Cases
Use Case               Response Latency           Query Rate   Possible Solutions
Interactive dashboard  sub-second to few seconds  ~1 qps       Columnar Store
Site facing            100ms (99th percentile)    1000s qps    KV Store (pre-cube)
Anomaly detection      sub-second to few seconds  10-100s qps  Streaming Engine
Same input data (Pageview)
Same OLAP-style query
Different solutions based on different workload characteristics
Interactive Dashboard
Use Case               Response Latency           Query Rate  Possible Solutions
Interactive dashboard  sub-second to few seconds  ~1 qps      Columnar Store
select sum(pageView), time from T
where country = 'us' and browser = 'chrome' …
group by time
[Figure: latency histogram; x-axis: latency (milliseconds), 0 to 500; y-axis: frequency; series: Pinot, Druid]
Site Facing
Use Case     Response Latency         Query Rate  Possible Solutions
Site facing  100ms (99th percentile)  1000s qps   KV Store
select sum(pageView) from T
where memberId = xx and privacySettings in …
group by time, [title|geo|industry]
[Figures (two panels): latency (milliseconds) vs. queries per second; series: Druid, Pinot]
Pinot Optimizations For Site Facing Use Cases
• Optimizing Query Processing
1. Sorted Index + Dynamic execution planning
• Optimizing Scatter and Gather
1. Smart segment assignment and routing
2. Data partitioning and pruning
Optimizing Query Processing: Sorted Index
• Access to both forward/inverted index
• Fetch contiguous block, benefit from locality
• For item filtering, pick scanning or inverted index based on cardinality of sorted column
Sorted index (memberId -> docId range)
memberId  start docId  end docId
123       0            100
456       101          300
…         …            …
Forward index
docId  memberId
0      123
…      …
100    123
101    456
…      …
300    456
…      …
select …
where memberId = 456 and item in (…)
group by …
[Figure: latency (milliseconds) vs. queries per second; series: sorted index, inverted index]
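A sorted index of this shape can be sketched in Python as follows. It is an illustration under assumptions: `build_sorted_index` and `filter_docs` are hypothetical helpers, the memberId column is assumed to be already sorted, and the item column is invented test data.

```python
def build_sorted_index(member_ids):
    """Map each memberId to its inclusive (start docId, end docId) range.

    Assumes `member_ids` is sorted, so each value occupies one contiguous block.
    """
    index = {}
    for doc_id, member_id in enumerate(member_ids):
        start, _ = index.get(member_id, (doc_id, doc_id))
        index[member_id] = (start, doc_id)
    return index

def filter_docs(index, items, member_id, wanted_items):
    """Evaluate `memberId = ? and item in (...)` by scanning one contiguous block."""
    start, end = index.get(member_id, (0, -1))  # empty range if the id is absent
    return [d for d in range(start, end + 1) if items[d] in wanted_items]

# Same layout as the slide: docIds 0-100 belong to 123, docIds 101-300 to 456.
member_ids = [123] * 101 + [456] * 200
items = ["a" if d % 2 == 0 else "b" for d in range(len(member_ids))]

index = build_sorted_index(member_ids)
print(index)
print(len(filter_docs(index, items, 456, {"a"})))
```

The lookup touches only the docId range of one member, which is where the locality benefit comes from.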
Optimizing Scatter and Gather: Querying All Servers
Replica group: a set of servers that contains a complete set of all segments.
[Diagram: segments 1-4 placed on servers, without and with replica groups (RG1, RG2); query 1 and query 2 routed to servers]
[Figure: latency (milliseconds) vs. queries per second; series: without routing optimization, with routing optimization]
Problem               Impact                                                                 Solution
Querying all servers  99th percentile latency is impacted by the slowest server (e.g. gc)  Control the number of servers to fan out to
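Replica-group routing can be sketched as below. The group layout, server names, and the round-robin choice of group are assumptions made for illustration; the slides do not specify how a replica group is selected per query.

```python
import itertools

# Hypothetical layout: each replica group holds a complete copy of segments 1-4.
REPLICA_GROUPS = {
    "RG1": {"server1": [1, 2], "server2": [3, 4]},
    "RG2": {"server3": [1, 3], "server4": [2, 4]},
}

def make_router(replica_groups):
    """Return a route(query) function that picks one replica group per query."""
    group_cycle = itertools.cycle(sorted(replica_groups))

    def route(_query):
        group = next(group_cycle)
        # The fan-out is limited to the servers of this single group.
        return group, replica_groups[group]

    return route

route = make_router(REPLICA_GROUPS)
print(route("query 1"))
print(route("query 2"))
```

Each query now touches two servers instead of four, so a slow server affects fewer queries.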
Optimizing Scatter and Gather: Querying All Segments
[Diagram: without partitioning, query 1 and query 2 are sent to segments S1, S2, S3, S4; with partitioning, S1 (p=1), S2 (p=1), S3 (p=2), S4 (p=2), query 1 (mid = p1), query 2 (mid = p2)]
Problem                Impact                   Solution
Querying all segments  More CPU work on server  Minimize the number of segments (partitioning and pruning)
select …
where memberId = 456 and item in (…)
group by …
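Partition-based pruning can be sketched as follows. The modulo partition function and the segment-to-partition map are assumptions for illustration; the only requirement is that ingestion and the broker agree on the same function.

```python
NUM_PARTITIONS = 2

# Hypothetical segment metadata: segment name -> partition id, as in the diagram.
SEGMENT_PARTITION = {"S1": 1, "S2": 1, "S3": 2, "S4": 2}

def partition_of(member_id, num_partitions=NUM_PARTITIONS):
    """Assumed partition function (modulo), with partitions numbered from 1."""
    return member_id % num_partitions + 1

def prune_segments(member_id, segment_partition=SEGMENT_PARTITION):
    """Keep only segments whose partition can contain `member_id`."""
    wanted = partition_of(member_id)
    return sorted(seg for seg, p in segment_partition.items() if p == wanted)

print(prune_segments(456))
print(prune_segments(457))
```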
Anomaly Detection: Challenge
for d1 in [us, ca, …]
  for d2 in [key1, key2, …]
    …
      select sum(pageViews) from T
      where country = d1 and page_key = d2 and
      source_app = d3 and device_name = d4 …
      group by country, time
      …
Filter                             Aggregation             Latency
select … where country = us, …     Slow, scan 60-70% data  high
select … where country = kenya, …  Scan less than 1%       low
• Latency is not predictable; it depends on the query predicate
• Monitoring all possible combinations makes the problem worse!
Time vs Space Trade-off
[Diagram: latency vs. storage requirement]
Columnar Store: variable latency, low storage overhead
KV Store (Pre-computed): low latency, high storage overhead
Startree Index: trades off between the two
Startree Index Generation
1. Multidimensional sort
2. Split on the column and create a node for each value
3. Create star node (aggregate metric after removing the split column)
4. Apply 1, 2, 3 for each node recursively and stop when number of records in node < SplitThreshold
Raw records
docId  country  browser  … other dimensions  impression
0      al       ie       …                   10
1      ca       safari   …                   10
2      …        …        …                   …
…      us       chrome   …                   10
…      us       chrome   …                   10
…      us       ie       …                   10
N      us       safari   …                   10
Aggregated records
docId  country  browser  impression
N+1    *        chrome   40
N+2    *        ie       20
N+3    *        safari   20
[Diagram: tree levels root -> country (al, ca, …, us, *) -> browser (chrome, …, safari)]
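The four generation steps can be sketched in Python as below, together with a small lookup that follows the star node for every dimension a query does not filter on. This is a simplified sketch, not Pinot's implementation: records are (dimension tuple, metric) pairs, nodes are plain dicts, all names are hypothetical, and no real dimension value is assumed to equal "*".

```python
from collections import defaultdict

STAR = "*"

def aggregate(records, level):
    """Step 3: drop dimension `level` (replace it with STAR) and sum the metric."""
    sums = defaultdict(int)
    for dims, metric in records:
        key = dims[:level] + (STAR,) + dims[level + 1:]
        sums[key] += metric
    return sorted(sums.items())

def build(records, level, num_dims, split_threshold):
    """Return a leaf {'records': [...]} or an inner node {'children': {...}}."""
    if level == num_dims or len(records) < split_threshold:
        return {"records": records}               # step 4: stop condition
    records = sorted(records)                     # step 1: multidimensional sort
    groups = defaultdict(list)
    for dims, metric in records:                  # step 2: one node per value
        groups[dims[level]].append((dims, metric))
    groups[STAR] = aggregate(records, level)      # step 3: star node
    return {"children": {value: build(group, level + 1, num_dims, split_threshold)
                         for value, group in groups.items()}}

def query_sum(node, filters, level=0):
    """Sum the metric; `filters` maps a dimension position to a required value."""
    if "records" in node:
        return sum(m for dims, m in node["records"]
                   if all(dims[i] == v for i, v in filters.items()))
    child = node["children"].get(filters.get(level, STAR))
    return 0 if child is None else query_sum(child, filters, level + 1)

# The six fully visible raw records from the slide: (country, browser) -> impression.
raw = [(("al", "ie"), 10), (("ca", "safari"), 10), (("us", "chrome"), 10),
       (("us", "chrome"), 10), (("us", "ie"), 10), (("us", "safari"), 10)]
tree = build(raw, level=0, num_dims=2, split_threshold=2)

print(query_sum(tree, {}))             # no filter
print(query_sum(tree, {0: "us"}))      # country = us
print(query_sum(tree, {1: "chrome"}))  # browser = chrome
```

With a very large `split_threshold` the root stays a single leaf holding all raw records (no prematerialization); with a threshold of 1 every node is split down to the last dimension.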
Time vs Space Trade-off with Startree
[Diagram: latency vs. storage requirement]
Columnar Store: SplitThreshold = infinity, no prematerialization
KV Store (Pre-computed): SplitThreshold = 1, full materialization
Startree Index: SplitThreshold = 100,000, partial data-aware materialization
Startree Query Execution
select sum(pageViews) from T
where country = AL
select sum(pageViews) from T
where country = CA
select sum(pageViews) from T
where browser = Chrome
select sum(pageViews) from T
General form:
select sum(X)
from T
where d1=v1 and d2=v2 and …
Any query pattern will scan less than SplitThreshold records
[Diagram: star tree with root -> country (al, ca, …, us, *) -> browser (chrome, …, safari, *); leaves point to raw docs and aggregated docs]
Anomaly Detection
[Figure: latency (milliseconds) vs. queries per second; series: Druid, Pinot with inverted index, Pinot with startree index]
Use Case           Response Latency           Query Throughput            Possible Solutions
Anomaly detection  sub-second to few seconds  10-100s queries per second  Streaming Engine
Pinot vs Druid
Feature                    Druid                           Pinot
Inverted Index             Always on all columns, fixed    Configurable on a per-column basis
Query Execution Layer      Fixed plan                      Split into planning and execution
Data Organization          N/A                             Sorted column
Partitioning               Only available for time column  Available for any column
Controlling query fan-out  N/A                             Replica group based segment assignment and routing
Smart pre-materialization  N/A                             Star-tree
Can we support all these use cases in one single system?
Use Case               Response Latency           Query Rate   Solution
Interactive dashboard  sub-second to few seconds  ~1 qps       Pinot
Site facing            100ms (99th percentile)    1000s qps    Pinot
Anomaly detection      sub-second to few seconds  10-100s qps  Pinot
Pinot: Realtime OLAP for 530 Million Users - Sigmod 2018
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 

Pinot: Realtime OLAP for 530 Million Users - SIGMOD 2018

  • 1. Pinot: Realtime OLAP for 530 Million Users. Seunghyun Lee, Software Engineer
  • 2. Today’s agenda 1. Motivation 2. Architecture Overview 3. Scaling Pinot 4. Q&A
  • 3. Analytics Use Case: Interactive Dashboard. Query: select sum(pageView), time from T where country = us, browser = chrome,… group by time. Slice and dice over arbitrary dimensions. Human-driven queries. Response latency: sub-second to few seconds. Query rate: ~1 qps. Possible solution: Columnar Store.
  • 4. Analytics Use Case: Site Facing. Query: select sum(pageView) from T where memberId = 456, pageKey = “profilePage”, privacySettings in (…) group by time,[title|geo|industry]. Pre-defined query format with different primary key values. Response latency: 100ms (99th percentile). Query rate: 1000s qps. Possible solution: KV Store.
  • 5. Analytics Use Case: Anomaly Detection. Queries: for d1 in [us, ca, … ], for d2 in [chrome, ie, … ], …: select sum(pageView), time from T where country = d1, browser = d2 group by time. Identifying all issues requires us to monitor all possible combinations. Periodic machine-generated queries (bursty). Response latency: sub-second to few seconds. Query rate: 10-100s qps. Possible solution: Streaming Engine.
  • 6. Same input data (Pageview), same OLAP style query. What makes these use cases use different solutions? Different solutions based on different workload characteristics. Can we support all these use cases in one single system?
     Use case | Response latency | Query rate | Possible solutions
     Interactive dashboard | sub-second to few seconds | ~1 qps | Columnar Store
     Site facing | 100ms (99th percentile) | 1000s qps | KV Store (pre-cube)
     Anomaly detection | sub-second to few seconds | 10-100s qps | Streaming Engine
  • 7. What is Pinot? SQL-like interface with predictable latency (no joins). Batch data ingestion (Hadoop). Realtime data ingestion (Kafka). Distributed, horizontally scalable. Open source! (https://github.com/linkedin/pinot)
  • 8. Pinot @ LinkedIn: +50 site-facing use cases; +60k queries per second; +1.4m records ingested per second; +2000 tables. 300B documents per data center; 2 trillion documents for internal use case.
  • 9. Today’s agenda 1. Motivation 2. Architecture Overview 3. Scaling Pinot 4. Q&A
  • 10. Architecture Overview • Controller - handles cluster-wide coordination using Apache Helix and ZooKeeper • Broker - handles query fan-out and query routing to servers • Server - responds to query requests originating from the brokers
  • 11. Query Execution: Distributed. [Diagram: Broker, Servers (S1, S2, S3), Controller (Helix).] 1. Query 2. Fetch routing table from Helix 3. Scatter request 4. Process request & send response 5. Gather response 6. Return response
  • 12. Query Execution: Hybrid Querying. [Diagram: the realtime server holds data for t = 1, 2, 3, 4, 5; the offline server holds no data yet.]
  • 13. Query Execution: Hybrid Querying. [Diagram: an offline Hadoop job pushes data for t = 1-2 to the offline server; the realtime server holds t = 1, 2, 3, 4, 5.]
  • 14. Query Execution: Hybrid Querying. [Diagram: the offline server holds t = 1-2; the realtime server holds t = 1, 2, 3, 4, 5.]
  • 15. Query Execution: Hybrid Querying. [Diagram: the broker sits in front of the offline server (t = 1-2) and the realtime server (t = 1, 2, 3, 4, 5).] Time boundary: 2
  • 16. Query Execution: Hybrid Querying. [Same diagram; the broker receives select sum(m) from T.] Time boundary: 2
  • 17. Query Execution: Hybrid Querying. [Same diagram; the broker splits select sum(m) from T into select sum(m) from T where t <= 2 for the offline server and select sum(m) from T where t > 2 for the realtime server.] Time boundary: 2
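The time-boundary split on slides 15 to 17 can be sketched in Python. This is a minimal illustration, not Pinot's broker code: the function names `split_hybrid_query` and `hybrid_sum`, the column name `t`, and the sample rows are all assumptions made for the example.

```python
def split_hybrid_query(base_query, time_column, time_boundary):
    """Return (offline_query, realtime_query) for a hybrid table.

    Assumes base_query is a simple SQL-like string; this sketch only
    checks whether it already contains a WHERE clause.
    """
    joiner = " and " if " where " in base_query.lower() else " where "
    offline_query = f"{base_query}{joiner}{time_column} <= {time_boundary}"
    realtime_query = f"{base_query}{joiner}{time_column} > {time_boundary}"
    return offline_query, realtime_query


def hybrid_sum(offline_rows, realtime_rows, time_boundary):
    """Sum metric m over (t, m) rows without double counting the overlap."""
    offline_part = sum(m for t, m in offline_rows if t <= time_boundary)
    realtime_part = sum(m for t, m in realtime_rows if t > time_boundary)
    return offline_part + realtime_part


# Hypothetical data: the offline server holds t = 1-2, the realtime server t = 1-5.
offline_rows = [(1, 10), (2, 20)]
realtime_rows = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]

print(split_hybrid_query("select sum(m) from T", "t", 2))
print(hybrid_sum(offline_rows, realtime_rows, time_boundary=2))
```

Because both servers hold t = 1 and t = 2, summing both sides without the boundary would count those rows twice; the two predicates make every value of t belong to exactly one side.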
  • 18. Query Execution: Single Node Query Optimization. select max(col) from T: use metadata instead of scanning. select sum(metric) from T where country = us and accountId = x: reorder filters for better performance (apply the accountId predicate before the country predicate). Dynamic query planning based on column metadata, index, and dictionary.
  • 19. Anatomy of Pinot Segment
     Metadata: start/end time, available indexes, partitioning info, min/max value, …
     Indexes: Inverted, Sorted, Startree
     Raw data (docId, country, code): (0, us, 002), (1, ca, 001), (2, jp, 003), …
     Dictionary (sorted values; the position is the dictId): country: ca, jp, us, …; code: 001, 002, 003, …
     Forward index (dictId per docId): country: 2, 0, 1, …; code: 1, 0, 2, …
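The dictionary and forward index on slide 19 can be reproduced with a few lines of Python. This is a sketch of the general dictionary-encoding idea, assuming 0-based dictIds and plain Python lists; the function name `build_column_indexes` is hypothetical and the on-disk layout of a real Pinot segment is not shown here.

```python
def build_column_indexes(values):
    """Dictionary-encode one column.

    Returns (dictionary, forward_index, inverted_index):
      dictionary     - sorted distinct values; list position is the dictId
      forward_index  - dictId for each docId
      inverted_index - docIds for each dictId
    """
    dictionary = sorted(set(values))
    dict_id_of = {value: dict_id for dict_id, value in enumerate(dictionary)}
    forward_index = [dict_id_of[value] for value in values]
    inverted_index = {}
    for doc_id, dict_id in enumerate(forward_index):
        inverted_index.setdefault(dict_id, []).append(doc_id)
    return dictionary, forward_index, inverted_index


# The three raw documents from the slide.
country_dict, country_fwd, country_inv = build_column_indexes(["us", "ca", "jp"])
code_dict, code_fwd, code_inv = build_column_indexes(["002", "001", "003"])

print(country_dict, country_fwd)  # ['ca', 'jp', 'us'] [2, 0, 1]
print(code_dict, code_fwd)        # ['001', '002', '003'] [1, 0, 2]
```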
  • 20. Today’s agenda 1. Motivation 2. Architecture Overview 3. Scaling Pinot 4. Q&A
  • 21. Recap: Analytics Use Cases. Same input data (Pageview), same OLAP style query. Different solutions based on different workload characteristics.
     Use case | Response latency | Query rate | Possible solutions
     Interactive dashboard | sub-second to few seconds | ~1 qps | Columnar Store
     Site facing | 100ms (99th percentile) | 1000s qps | KV Store (pre-cube)
     Anomaly detection | sub-second to few seconds | 10-100s qps | Streaming Engine
  • 22. Interactive Dashboard. Response latency: sub-second to few seconds. Query rate: ~1 qps. Possible solution: Columnar Store. Query: select sum(pageView), time from T where country = us, browser = chrome,… group by time. [Figure: latency histogram; x-axis: latency (milliseconds), y-axis: frequency; series: pinot, druid.]
  • 23. Site Facing. Response latency: 100ms (99th percentile). Query rate: 1000s qps. Possible solution: KV Store. Query: select sum(pageView) from T where memberId = xx, privacySettings in… group by time,[title|geo|industry]. [Figure: two plots of latency (milliseconds) vs. queries per second; series: druid, pinot.]
  • 24. Pinot Optimizations For Site Facing Use Cases • Optimizing Query Processing 1. Sorted Index + Dynamic execution planning • Optimizing Scatter and Gather 1. Smart segment assignment and routing 2. Data partitioning and pruning
  • 25. Optimizing Query Processing: Sorted Index • Access to both forward/inverted index • Fetch contiguous block, benefit from locality • For item filtering, pick scanning or inverted index based on cardinality of sorted column. Query: select … where memberId = 456, item in(…) group by …
     Sorted index (memberId, start docId, end docId): (123, 0, 100), (456, 101, 300), …
     Forward index (docId, memberId): (0, 123), …, (100, 123), (101, 456), …, (300, 456), …
     [Figure: latency (milliseconds) vs. queries per second; series: sorted index, inverted index.]
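The sorted index on slide 25 maps each value to one contiguous docId range. A minimal Python sketch follows, assuming the column is already sorted and using hypothetical helper names `build_sorted_index` and `doc_ids_for`; it is not taken from the Pinot code base.

```python
def build_sorted_index(sorted_column):
    """Map each value to its (start docId, end docId) range, both inclusive.

    Assumes sorted_column is already sorted; on unsorted input the ranges
    would be wrong because equal values would not be contiguous.
    """
    ranges = {}
    for doc_id, value in enumerate(sorted_column):
        start, _ = ranges.get(value, (doc_id, doc_id))
        ranges[value] = (start, doc_id)
    return ranges


def doc_ids_for(ranges, value):
    """Return the contiguous block of docIds that match value."""
    if value not in ranges:
        return range(0)
    start, end = ranges[value]
    return range(start, end + 1)


# Same layout as the slide: member 123 owns docIds 0-100, member 456 owns 101-300.
member_ids = [123] * 101 + [456] * 200
sorted_index = build_sorted_index(member_ids)

print(sorted_index)                          # {123: (0, 100), 456: (101, 300)}
print(len(doc_ids_for(sorted_index, 456)))   # 200
```

A filter such as memberId = 456 then touches only one block of 200 documents, and any further predicate (the item filter in the slide's query) is evaluated inside that block.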
  • 26. Optimizing Scatter and Gather: Querying All Servers. Replica group: a set of servers that contains a complete set of all segments. Problem: querying all servers. Impact: 99th percentile latency is impacted by the slowest server (e.g. GC). Solution: control the number of servers to fan out to. [Diagram: without replica groups, query 1 and query 2 each touch every server; with replica groups RG1 and RG2, each query goes to a single group that holds segments 1-4.] [Figure: latency (milliseconds) vs. queries per second; series: without routing optimization, with routing optimization.]
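One way to picture replica-group routing is a broker that sends each query to exactly one group. The sketch below assumes simple round-robin selection and hypothetical server names; the slide does not say how Pinot chooses a group, so treat the selection policy as an illustration only.

```python
import itertools


def make_replica_group_router(replica_groups):
    """Return a function that picks one replica group per query.

    replica_groups is a list of server lists; every group is assumed to
    hold a complete copy of all segments, as defined on the slide.
    """
    group_cycle = itertools.cycle(replica_groups)

    def route_query():
        # Fan out to the servers of a single group, not to every server.
        return next(group_cycle)

    return route_query


replica_groups = [["server1", "server2"], ["server3", "server4"]]
route_query = make_replica_group_router(replica_groups)

first = route_query()
second = route_query()
third = route_query()
print(first, second, third)
```

With two groups of two servers, each query waits on two servers, not four, so one slow server delays only the queries routed to its group.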
  • 27. Optimizing Scatter and Gather: Querying All Segments. Problem: querying all segments. Impact: more CPU work on the server. Solution: minimize the number of segments queried (partitioning and pruning). Query: select … where memberId = 456, item in(…) group by … [Diagram: without partitioning, query 1 and query 2 each touch segments S1, S2, S3, S4; with partitioning, S1 (p=1) and S2 (p=1) serve query 1 (mid = p1) while S3 (p=2) and S4 (p=2) serve query 2 (mid = p2).]
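Partition-based pruning can be sketched as follows. The modulo partition function, the 0-based partition ids, and the names `partition_of` and `prune_segments` are assumptions for illustration; the slide only states that segments carry a partition id and that a query on memberId needs the segments of one partition.

```python
def partition_of(member_id, num_partitions):
    """Hypothetical partition function: plain modulo on the member id."""
    return member_id % num_partitions


def prune_segments(segment_partitions, member_id, num_partitions):
    """Keep only the segments whose partition can contain member_id."""
    target = partition_of(member_id, num_partitions)
    return sorted(
        name for name, partition in segment_partitions.items()
        if partition == target
    )


# Four segments in two partitions (0-based here; the slide labels them p=1, p=2).
segment_partitions = {"S1": 0, "S2": 0, "S3": 1, "S4": 1}

print(prune_segments(segment_partitions, member_id=456, num_partitions=2))  # ['S1', 'S2']
print(prune_segments(segment_partitions, member_id=457, num_partitions=2))  # ['S3', 'S4']
```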
  • 28. Anomaly Detection: Challenge. Queries: for d1 in [us, ca, …], for d2 in [key1, key2,…], …: select sum(pageViews) from T where country=d1, page_key=d2, source_app=d3, device_name=d4… group by country, time … • Latency is not predictable; it depends on the query predicate • Monitoring all possible combinations makes the problem worse!
     Filter | Aggregation | Latency
     select … where country = us,… | Slow, scans 60-70% of the data | high
     select … where country = kenya,… | Scans less than 1% | low
  • 29. Time vs Space Trade-off. [Diagram: axes are latency and storage requirement; points: Columnar Store, KV Store (pre-computed), Startree Index.] Columnar Store: variable latency, low storage overhead. KV Store (pre-computed): low latency, high storage overhead.
  • 30. Startree Index Generation
     1. Multidimensional sort
     2. Split on the column and create a node for each value
     3. Create star node (aggregate metric after removing the split column)
     4. Apply 1, 2, 3 for each node recursively and stop when the number of records in a node < SplitThreshold
     Raw records (docId, country, browser, … other dimensions, impression): (0, al, ie, 10), (1, ca, safari, 10), (2, …), (…, us, chrome, 10), (…, us, chrome, 10), (…, us, ie, 10), (N, us, safari, 10)
     Aggregated records: (N+1, *, chrome, 40), (N+2, *, ie, 20), (N+3, *, safari, 20)
     [Tree diagram: root; country level: al, ca, …, us, *; browser level: chrome, …, safari.]
  • 31. Time vs Space Trade-off with Startree. [Diagram: axes are latency and storage requirement; points: Columnar Store, KV Store (pre-computed), Startree Index.] SplitThreshold = infinity: no pre-materialization. SplitThreshold = 1: full materialization. SplitThreshold = 100,000: partial, data-aware materialization.
  • 32. Startree Query Execution. Example queries: select sum(pageViews) from T where country = AL; select sum(pageViews) from T where country = CA; select sum(pageViews) from T where browser = Chrome; select sum(pageViews) from T. For select sum(X) from T where d1=v1 and d2=v2 and …, any query pattern will scan fewer than SplitThreshold records. [Tree diagram: root; country level: al, ca, …, us, *; browser level: chrome, …, safari, *; leaves point to raw docs or aggregated docs.]
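The generation steps on slide 30 and the traversal on slide 32 can be combined into a small Python sketch. It is a simplified model under stated assumptions: records live in memory as (dimension tuple, metric) pairs, the split order is the dimension order, sum is the only aggregation, and the literal "*" never appears as a data value. The names `build_star_tree` and `query_sum` and the six sample records are hypothetical.

```python
from collections import defaultdict

STAR = "*"


def build_star_tree(records, num_dims, split_threshold, depth=0):
    """Build a node from records of the form (dimension_tuple, metric)."""
    records = sorted(records)  # step 1: multidimensional sort
    node = {"records": records, "children": {}}
    # Step 4: stop when the node is small enough or no dimension is left.
    if len(records) < split_threshold or depth == num_dims:
        return node

    groups = defaultdict(list)      # step 2: one child per value
    star_totals = defaultdict(int)  # step 3: aggregate without the split column
    for dims, metric in records:
        groups[dims[depth]].append((dims, metric))
        star_key = dims[:depth] + (STAR,) + dims[depth + 1:]
        star_totals[star_key] += metric

    for value, group in groups.items():
        node["children"][value] = build_star_tree(
            group, num_dims, split_threshold, depth + 1)
    node["children"][STAR] = build_star_tree(
        list(star_totals.items()), num_dims, split_threshold, depth + 1)
    return node


def query_sum(node, dim_names, filters, depth=0):
    """Sum the metric for equality filters given as {dimension name: value}."""
    if not node["children"]:
        # Leaf: scan its (raw or aggregated) records and apply the filters.
        return sum(
            metric for dims, metric in node["records"]
            if all(dims[dim_names.index(name)] == value
                   for name, value in filters.items())
        )
    # Follow the value child if this dimension is filtered, else the star child.
    key = filters.get(dim_names[depth], STAR)
    child = node["children"].get(key)
    return 0 if child is None else query_sum(child, dim_names, filters, depth + 1)


dim_names = ["country", "browser"]
records = [
    (("al", "ie"), 10), (("ca", "safari"), 10), (("us", "chrome"), 10),
    (("us", "chrome"), 10), (("us", "ie"), 10), (("us", "safari"), 10),
]
tree = build_star_tree(records, num_dims=2, split_threshold=3)

print(query_sum(tree, dim_names, {}))                        # 60
print(query_sum(tree, dim_names, {"browser": "chrome"}))     # 20
print(query_sum(tree, dim_names, {"country": "us"}))         # 40
```

Passing `float("inf")` as the threshold leaves the root as a single leaf that scans every raw record, while a threshold of 1 splits until all dimensions are used, which mirrors the two extremes named on slide 31.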
  • 33. Anomaly Detection. [Figure: latency (milliseconds) vs. queries per second; series: druid, pinot with inverted index, pinot with startree index.] Response latency: sub-second to few seconds. Query throughput: 10-100s queries per second. Possible solution: Streaming Engine.
  • 34. Pinot vs Druid
     Feature | Druid | Pinot
     Inverted index | Always on all columns, fixed | Configurable on a per-column basis
     Query execution layer | Fixed plan | Split into planning and execution
     Data organization | N/A | Sorted column
     Partitioning | Only available for time column | Available for any column
     Controlling query fan-out | N/A | Replica group based segment assignment and routing
     Smart pre-materialization | N/A | Star-tree
  • 35. Can we support all these use cases in one single system?
     Use case | Response latency | Query rate | Solution
     Interactive dashboard | sub-second to few seconds | ~1 qps | Pinot
     Site facing | 100ms (99th percentile) | 1000s qps | Pinot
     Anomaly detection | sub-second to few seconds | 10-100s qps | Pinot