Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022

  1. Pulsar Summit San Francisco, Hotel Nikko, August 18, 2022. Keynote: Data Democracy: Journey to User-Facing Analytics. Xiang Fu, Co-Founder • StarTree
  2. About Me: Co-Founder at StarTree, a cloud-native platform to build the next generation of data analytics applications for millions of users. Founder and PMC of Apache Pinot, a real-time, distributed OLAP datastore. Previously, architect at Uber's data platform team, solving streaming data serving, processing, and analytics problems at large scale.
  3. Data Democracy - Are We There? Past to Present: Internal-Facing Analytics (SQL editors, dashboards; operators, analysts). Current technology has done a great job delivering insights for INTERNAL USERS. The Gap. Present to Future: User-Facing Analytics (latency-sensitive analytical data apps; users, customers). To truly democratize data, we need to deliver high-quality insights to EXTERNAL USERS.
  4. Time Value of Data: vanishing window of opportunity for events. [Chart: Value vs. Time]
  5. Streaming Changed The Game… Event to Insight. Data warehouses & lakes: hours to days. Stream: milliseconds to seconds.
  6. And Started A Cycle: Streaming technologies like PULSAR increased speed and reduced costs to store events, kicking off a cycle… Collect more events, improve user experience, increase user engagement.
  7. Streaming Spawned New Use Cases: Streaming Sources, Messaging, Pub-sub, Log Aggregation, Streaming Processing, Real Time Analytics
  8. How Do We Do Real-time Analytics? In simple terms, we need to: ingest data as soon as events happen; query that data as soon as it's ingested; do the above at scale. Simple is HARD!
  9. Enter Apache Pinot
  10. Apache Pinot At A Glance: Ingestion sources (real-time, batch, SQL); efficient compute and indexing powerhouse; compressed and scalable storage (PB scale); advanced query support; multi-tenant and distributed architecture.
  11. Apache Pinot Impact: 5000 queries/sec, ~5ms average latency, <100ms 95th percentile. Before Pinot (2014): 1500 queries/sec, 200M+ members. After Pinot (2016): 5,000 queries/sec, 700M+ members. 45X improvement in efficiency (1000 nodes, 75 nodes).
  12. Apache Pinot Timeline: 2013, started @ LinkedIn; 2015, open source; 2019, StarTree founded; 2021, Apache graduation.
  13. Apache Pinot Community Growth. 2020: 40+ companies, 800 Slack users, 55k downloads. 2022: 100+ companies, 2500+ Slack users, 1M+ downloads.
  14. Apache Pinot Adoption
  15. Apache Pinot Architecture
  16. Strong Integration with Pulsar
  17. Real Time Use Cases. INTERNAL FACING ANALYTICS (Business Analysts, Platform Operators) and USER FACING ANALYTICS (Application Users, Business Partners). Industries: Food Delivery, FinTech. Examples shown: Long Orders Insights, Nearby Orders in App, Restaurants Manager Dashboard, Merchants Dashboard, Ledger, Observability.
  18. Apache Pinot At Scale. Events/sec: 1M+; queries/sec: 200K+; query latency: ms. Data size: 1PB+; rows: 1T; query latency: < 1s. Data size: 200TB+; queries/sec: 30K+; query latency: < 100ms. (Confidential - Do not duplicate or distribute without consent of StarTree Inc.)
  19. Democratizing data through User-Facing Analytics
  20. LinkedIn: Who Viewed My Profile, Publishing Analytics Platform
  21. Uber: Restaurant Manager, Orders Near You
  22. Thank you! Contact Me: @xiangfu0
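
As a rough sanity check on the 45X efficiency figure from slide 11, the sketch below compares queries per second per node before and after Pinot. It assumes, since the transcript does not say so explicitly, that the 1000-node figure belongs to the pre-Pinot system and the 75-node figure to Pinot, and that "efficiency" means throughput per node. The function and variable names are illustrative only.

```python
# Figures as transcribed from slide 11.
# ASSUMPTION: 1000 nodes pairs with "Before Pinot" and 75 nodes with "After Pinot".
before_qps, before_nodes = 1500, 1000
after_qps, after_nodes = 5000, 75


def qps_per_node(qps: float, nodes: int) -> float:
    """Throughput served by each node, used here as a proxy for efficiency."""
    return qps / nodes


# Ratio of per-node throughput after vs. before.
improvement = qps_per_node(after_qps, after_nodes) / qps_per_node(before_qps, before_nodes)

print(f"before: {qps_per_node(before_qps, before_nodes):.2f} queries/sec per node")
print(f"after:  {qps_per_node(after_qps, after_nodes):.2f} queries/sec per node")
print(f"improvement: {improvement:.1f}x")
```

Under these assumptions the ratio comes out at roughly 44x, close to the slide's rounded 45X. The slide does not state how its figure was derived, so the exact basis may differ.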