Pinot
Kishore Gopalakrishna
Tuesday, August 18, 15
Agenda
• Pinot @ LinkedIn - Current
• Pinot - Architecture
• Pinot Operations
• Pinot @ LinkedIn - Future
WVMP (Who Viewed My Profile)
Slice and Dice Metrics
Pinot @ LinkedIn
Customers Members Internal tools
• 100B documents
• 1B documents ingested per day
• 100M queries per day
• Tens of ms latency
• 30 tables in prod, 250 * 3 ...
Key features
• SQL-like interface
• Columnar storage and indexing
• Real-time data load
(S)QL: Filters and Aggs
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
'day' >= 15949 AND ...
(S)QL: Group By
SELECT count(*)
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
'day' >= 15949 AND 'day' <=...
(S)QL: ORDER BY and LIMIT
SELECT *
FROM companyFollowHistoricalEvents
WHERE entityId = 121011 AND
entityId = 1000 AND
acti...
What's not supported
• JOIN: unpredictable performance
• NOT A SOURCE OF TRUTH
• Mutation
Pinot
• Data flow
• Query Execution
• How to use/operate
• Pinot @ LinkedIn - Future
Pinot Architecture
[Diagram labels: Queries, Broker, Helix, Real time, Historical, Kafka, Hadoop, Raw Data]
Pinot
• Pinot segments
Pinot Segment layout: Columnar storage
Pinot Segment layout: Sorted Forward Index
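One way to picture a sorted forward index, as a minimal sketch: if a segment's documents are sorted on a column, each distinct value occupies one contiguous run of doc ids, so the column can be stored as a (first, last) range per value, and an equality filter becomes a lookup rather than a scan. The function names and sample data below are hypothetical and do not reflect Pinot's actual storage format.

```python
from bisect import bisect_left, bisect_right


def build_sorted_forward_index(sorted_values):
    """Map each distinct value to its inclusive (first_doc, last_doc) range.

    Assumes the column is already sorted, so equal values are contiguous.
    """
    ranges = {}
    for doc_id, value in enumerate(sorted_values):
        first, _ = ranges.get(value, (doc_id, doc_id))
        ranges[value] = (first, doc_id)
    return ranges


def docs_equal(ranges, value):
    """Doc ids where column == value, found without scanning the column."""
    if value not in ranges:
        return range(0)
    first, last = ranges[value]
    return range(first, last + 1)


def docs_between(sorted_values, low, high):
    """Doc ids where low <= column <= high, via two binary searches."""
    return range(bisect_left(sorted_values, low), bisect_right(sorted_values, high))


# Hypothetical entityId column of a segment sorted on entityId.
entity_ids = [1000, 1000, 121011, 121011, 121011, 200000]
index = build_sorted_forward_index(entity_ids)
print(list(docs_equal(index, 121011)))               # [2, 3, 4]
print(list(docs_between(entity_ids, 1000, 121011)))  # [0, 1, 2, 3, 4]
```

Because matching documents form a single contiguous range, a filter on the sorted column needs no inverted index at all.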
Pinot Segment layout: Other techniques
• Indexes: Inverted index, Bitmap, RoaringBitmap
• Compression: Dictionary Encoding...
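As an illustration of two of the techniques listed above, the sketch below dictionary-encodes a column and builds one bitmap per dictionary id, so that an AND of two filters becomes a bitwise AND. Plain Python integers stand in for compressed bitmaps such as RoaringBitmap; the helper names and the sample columns are assumptions made for this example, not Pinot code.

```python
def dictionary_encode(values):
    """Return (sorted dictionary, forward index of dictionary ids)."""
    dictionary = sorted(set(values))
    id_of = {value: dict_id for dict_id, value in enumerate(dictionary)}
    return dictionary, [id_of[value] for value in values]


def build_bitmap_index(forward, cardinality):
    """One bitmap per dictionary id; bit i is set when doc i holds that id."""
    bitmaps = [0] * cardinality
    for doc_id, dict_id in enumerate(forward):
        bitmaps[dict_id] |= 1 << doc_id
    return bitmaps


def bitmap_to_doc_ids(bitmap):
    """Expand a bitmap back into a sorted list of doc ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical columns for five documents.
action = ["start", "stop", "start", "stop", "stop"]
paid = ["y", "n", "y", "y", "n"]

action_dict, action_fwd = dictionary_encode(action)
paid_dict, paid_fwd = dictionary_encode(paid)
action_bitmaps = build_bitmap_index(action_fwd, len(action_dict))
paid_bitmaps = build_bitmap_index(paid_fwd, len(paid_dict))

# WHERE action = 'stop' AND paid = 'y' as a bitwise AND of two bitmaps.
matches = action_bitmaps[action_dict.index("stop")] & paid_bitmaps[paid_dict.index("y")]
print(bitmap_to_doc_ids(matches))  # [3]
```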
Data-aware pre-computation: Star-tree index
Pinot
• Query Execution
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
2. Fetch routing table from Helix
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
2. Fetch routing table from Helix
3. Sca...
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
2. Fetch routing table from Helix
3. Sca...
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
2. Fetch routing table from Helix
3. Sca...
Pinot Query Execution: Distributed
[Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
1. Query
2. Fetch routing table from Helix
3. Sca...
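The distributed flow in these slides (query, fetch routing table from Helix, scatter request, process request and send response, gather response, return response) can be sketched roughly as follows. This is a toy model under stated assumptions: `ROUTING_TABLE` stands in for the table a broker would fetch from Helix, `SEGMENTS` for data held by servers, and a thread pool for network calls. None of these names come from Pinot.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical routing table: which server answers for which segments.
ROUTING_TABLE = {"server1": ["S1", "S2"], "server2": ["S3"]}

# Hypothetical segment contents; each segment is a list of rows.
SEGMENTS = {
    "S1": [{"entityId": 121011, "action": "start"},
           {"entityId": 121011, "action": "stop"}],
    "S2": [{"entityId": 1000, "action": "start"},
           {"entityId": 121011, "action": "start"}],
    "S3": [{"entityId": 121011, "action": "stop"},
           {"entityId": 121011, "action": "stop"}],
}


def process_on_server(segment_names, predicate, group_col):
    """Step 4: a server computes a partial count(*) GROUP BY over its segments."""
    partial = Counter()
    for name in segment_names:
        for row in SEGMENTS[name]:
            if predicate(row):
                partial[row[group_col]] += 1
    return partial


def broker_execute(predicate, group_col, routing_table=ROUTING_TABLE):
    """Steps 1 to 6 from the broker's point of view."""
    with ThreadPoolExecutor() as pool:
        # Step 3: scatter one request per server listed in the routing table.
        futures = [pool.submit(process_on_server, segs, predicate, group_col)
                   for segs in routing_table.values()]
        # Step 5: gather and merge the partial results.
        total = Counter()
        for future in futures:
            total.update(future.result())
    # Step 6: return the merged response.
    return dict(total)


result = broker_execute(lambda row: row["entityId"] == 121011, "action")
print(result)
```

The merge is only this simple because counts are additive; aggregations that are not additive need a mergeable intermediate form.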
Pinot Query Execution: Single Node Architecture
[Diagram layers: EXECUTION ENGINE, INVERTED INDEX, BITMAP INDEX, COLUMN FORMAT, PLANNER]
Pinot Query Execution: Single Node Architecture
SELECT
campaignId,
sum(clicks)
FROM Table A
WHERE
accountId = 121011
AND
'...
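The operator names on this slide (Filter, Projection, Aggregation Group by, Combine) suggest a per-segment pipeline whose partial results are merged at the end. The sketch below runs the slide's query shape (sum of clicks per campaignId for one accountId from a given day) over two tiny columnar segments. The column data and function signatures are invented for illustration and are only an assumption about how such operators could fit together.

```python
from collections import defaultdict


def filter_operator(segment, account_id, min_day):
    """Return the doc ids that satisfy accountId = ? AND day >= ?."""
    pairs = zip(segment["accountId"], segment["day"])
    return [doc for doc, (acct, day) in enumerate(pairs)
            if acct == account_id and day >= min_day]


def projection_operator(segment, doc_ids):
    """Read only the columns the query needs, for matching docs only."""
    return [(segment["campaignId"][d], segment["clicks"][d]) for d in doc_ids]


def aggregation_group_by_operator(tuples):
    """sum(clicks) GROUP BY campaignId within one segment."""
    sums = defaultdict(int)
    for campaign_id, clicks in tuples:
        sums[campaign_id] += clicks
    return dict(sums)


def combine_operator(partials):
    """Merge the per-segment partial sums."""
    combined = defaultdict(int)
    for partial in partials:
        for campaign_id, clicks in partial.items():
            combined[campaign_id] += clicks
    return dict(combined)


def run_query(segments, account_id, min_day):
    partials = []
    for segment in segments:
        doc_ids = filter_operator(segment, account_id, min_day)
        tuples = projection_operator(segment, doc_ids)
        partials.append(aggregation_group_by_operator(tuples))
    return combine_operator(partials)


# Hypothetical columnar segments: one list per column, indexed by doc id.
seg1 = {"accountId": [121011, 121011, 7], "day": [15949, 15940, 15950],
        "campaignId": [1, 1, 2], "clicks": [10, 5, 3]}
seg2 = {"accountId": [121011, 121011], "day": [15950, 15951],
        "campaignId": [1, 2], "clicks": [4, 6]}

print(run_query([seg1, seg2], account_id=121011, min_day=15949))  # {1: 14, 2: 6}
```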
Pinot
• Operations
Cluster Management: Deployment
Helix
Brokers
Servers
• Brokers and Servers register themselves in Helix
• All servers star...
Onboarding a new use case
[Diagram: Create Table command, Controller, Helix, Brokers, and Servers, with nodes tagged XLNT; table columns: Tag, Servers, TableNa...]
Segment Assignment
[Diagram: Upload Segment, Controller, Helix, Brokers, and Servers hosting segments S1, S2, S3; table: TableName XLNT_T1, Copies 2]
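To make the "Copies" setting concrete, here is a minimal sketch of placing each uploaded segment on that many servers. The least-loaded placement rule is an assumption chosen for the example; the slides do not say which strategy the controller or Helix actually uses, and `assign_segment` is a hypothetical helper.

```python
def assign_segment(assignment, segment, servers, copies):
    """Place `segment` on the `copies` least-loaded servers (ties by name)."""
    if copies > len(servers):
        raise ValueError("not enough servers for the requested number of copies")
    # Current load = number of segments each server already hosts.
    load = {s: sum(s in hosts for hosts in assignment.values()) for s in servers}
    chosen = sorted(servers, key=lambda s: (load[s], s))[:copies]
    assignment[segment] = chosen
    return chosen


# Table XLNT_T1 with 2 copies, as on the slide; server names are made up.
servers = ["server1", "server2", "server3"]
assignment = {}
for segment in ["S1", "S2", "S3"]:
    assign_segment(assignment, segment, servers, copies=2)

print(assignment)
```

With three segments, three servers, and two copies, every server ends up hosting two segments, so losing one server still leaves a copy of each segment available.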
• AUTO recovery mode: Automatically redistribute
segments on failure/addition of new nodes
• Custom mode: Run in degraded ...
Pinot vs Druid
• Architecture. Druid: Realtime + Offline, Realtime only. Pinot: Realtime + Offline. Note: Realtime only -> consistency is...
• Documentation & tooling
• In progress - consistency among real time replicas.
• Improve cost to serve - leverage SSD, pa...
Thank You

Pinot: Realtime Distributed OLAP datastore

Pinot is a realtime distributed OLAP datastore, which is used at LinkedIn to deliver scalable real time analytics with low latency. It can ingest data from offline data sources (such as Hadoop and flat files) as well as online sources (such as Kafka). Pinot is designed to scale horizontally.

  1. Pinot Kishore Gopalakrishna Tuesday, August 18, 15
  2. Agenda • Pinot @ LinkedIn - Current • Pinot - Architecture • Pinot Operations • Pinot @ LinkedIn - Future
  3. WVMP (Who Viewed My Profile)
  4. Slice and Dice Metrics
  5. Pinot @ LinkedIn: Customers, Members, Internal tools
  6. Pinot @ LinkedIn • 100B documents • 1B documents ingested per day • 100M queries per day • Tens of ms latency • 30 tables in prod, 250 * 3 std app nodes
  7. Key features • SQL-like interface • Columnar storage and indexing • Real-time data load
  8. (S)QL: Filters and Aggs
       SELECT count(*)
       FROM companyFollowHistoricalEvents
       WHERE entityId = 121011 AND
             'day' >= 15949 AND 'day' <= 15963 AND
             paid = 'y' AND action = 'stop'
  9. (S)QL: Group By
       SELECT count(*)
       FROM companyFollowHistoricalEvents
       WHERE entityId = 121011 AND
             'day' >= 15949 AND 'day' <= 15963 AND
             paid = 'y'
       GROUP BY action
  10. (S)QL: ORDER BY and LIMIT
       SELECT *
       FROM companyFollowHistoricalEvents
       WHERE entityId = 121011 AND
             entityId = 1000 AND
             action = 'start'
       ORDER BY creationTime DESC
       LIMIT 1
  11. What's not supported • JOIN: unpredictable performance • NOT A SOURCE OF TRUTH • Mutation
  12. Pinot • Data flow • Query Execution • How to use/operate • Pinot @ LinkedIn - Future
  13. Pinot Architecture [Diagram labels: Queries, Broker, Helix, Real time, Historical, Kafka, Hadoop, Raw Data]
  14. Pinot • Pinot segments
  15. Pinot Segment layout: Columnar storage
  16. Pinot Segment layout: Sorted Forward Index
  17. Pinot Segment layout: Other techniques • Indexes: Inverted index, Bitmap, RoaringBitmap • Compression: Dictionary Encoding, P4Delta • Multi-valued columns, skip lists • HyperLogLog for unique counts • T-digest for percentile, quantile
  18. Data-aware pre-computation: Star-tree index
  19. Pinot • Query Execution
  20. Pinot Query Execution: Distributed [Diagram: Brokers, Helix, and Servers hosting segments S1, S2, S3]
  21. Pinot Query Execution: Distributed [Same diagram] 1. Query
  22. Pinot Query Execution: Distributed [Same diagram] 1. Query 2. Fetch routing table from Helix
  23. Pinot Query Execution: Distributed [Same diagram] 1. Query 2. Fetch routing table from Helix 3. Scatter Request
  24. Pinot Query Execution: Distributed [Same diagram] 1. Query 2. Fetch routing table from Helix 3. Scatter Request 4. Process Request & send response
  25. Pinot Query Execution: Distributed [Same diagram] 1. Query 2. Fetch routing table from Helix 3. Scatter Request 4. Process Request & send response 5. Gather Response
  26. Pinot Query Execution: Distributed [Same diagram] 1. Query 2. Fetch routing table from Helix 3. Scatter Request 4. Process Request & send response 5. Gather Response 6. Return Response
  27. Pinot Query Execution: Single Node Architecture [Diagram layers: EXECUTION ENGINE, INVERTED INDEX, BITMAP INDEX, COLUMN FORMAT, PLANNER]
  28. Pinot Query Execution: Single Node Architecture
       SELECT campaignId, sum(clicks)
       FROM Table A
       WHERE accountId = 121011 AND 'day' >= 15949
       GROUP BY campaignId
       [Diagram: columns accountId, day, campaignId, click; operators Filter, Projection, Aggregation Group by, Combine over Pinot Segments and Data sources; intermediate results: matching doc ids, (campaignId, click) tuples]
  29. Pinot • Operations
  30. Cluster Management: Deployment [Diagram: Controller, Helix, Brokers, Servers] • Brokers and Servers register themselves in Helix • All servers start with no use-case-specific configuration
  31. Onboarding a new use case [Diagram: Create Table command, Controller, Helix, Brokers, and Servers, with nodes tagged XLNT; table: Tag XLNT, Servers 3, TableName XLNT_T1, Brokers 1]
  32. Segment Assignment [Diagram: Upload Segment, Controller, Helix, Brokers, and Servers hosting segments S1, S2, S3; table: TableName XLNT_T1, Copies 2]
  33. Pinot - Fault tolerance/Elasticity • AUTO recovery mode: automatically redistribute segments on failure/addition of new nodes • Custom mode: run in degraded mode until node is restarted/replaced
  34. Pinot vs Druid
       • Architecture. Druid: Realtime + Offline, Realtime only. Pinot: Realtime + Offline. Note: Realtime only -> consistency is hard and schema evolution/bootstrap is hard.
       • Inverted index. Druid: always on all columns, fixed. Pinot: configurable on a per-column basis. Note: allows a trade-off between scanning and inverted index + scanning; more data can fit in a given memory size.
       • Data organization. Druid: N/A. Pinot: sorts data. Note: organizing data provides speed/better compression and removes the need for an inverted index.
       • Smart pre-materialization. Druid: N/A. Pinot: star-tree. Note: allows a trade-off between latency and space.
       • Query execution layer. Druid: fixed plan. Pinot: split into planning and execution. Note: smart choices can be made at runtime based on metadata/query.
  35. Pinot - Future • Documentation & tooling • In progress: consistency among real-time replicas • Improve cost to serve: leverage SSD, partial pre-materialization • ThirdEye: business metrics monitoring
  36. Thank You
