Cost Effectively and Reliably Aggregating Billions of Messages Per Day Using Kafka (Chunky Gupta and Osman Sarood, Mist Systems) Kafka Summit NYC 2019

In this session, we will discuss Live Aggregators (LA), Mist's highly reliable and massively scalable in-house real-time aggregation system that relies on Kafka for ensuring fault tolerance and scalability. LA consumes billions of messages a day from Kafka with a memory footprint of over 750 GB and aggregates over 100 million time series. Since it runs entirely on top of AWS spot instances, it is designed to be highly reliable. LA can recover from hours-long complete EC2 outages using its checkpointing mechanism, which depends on Kafka. This recovery mechanism restores the checkpoint and replays messages from Kafka where it left off, ensuring no data loss. The characteristic that sets LA apart is its ability to autoscale by intelligently learning about resource usage and allocating resources accordingly. LA emits custom metrics that track resource usage for different components, i.e., Kafka consumer, shared memory manager, and aggregator, to achieve server utilization of over 70%. We do multi-level aggregations in LA to intelligently solve load imbalance issues among different partitions of a Kafka topic. We'd demonstrate multi-level aggregation using an example in which we aggregate indoor location data coming from different organizations both spatially and temporally. We'd explain how changing the partitioning key, along with writing intermediate data back to Kafka in a new topic for the next-level aggregators, helps Mist scale our solution. LA runs on top of 400+ cores, comprised of 10+ different Amazon EC2 spot instance types/sizes. We track the CPU usage for reading each Kafka stream on all the different instance types/sizes. We have several months of such data from our production Mesos cluster, which we are incorporating into LA's scheduler to improve our server utilization and prevent CPU hot spots from developing on our cluster. Detailed blog: https://www.mist.com/live-aggregators-highly-reliable-massively-scalable-real-time-aggregation-system/


  1. Cost Effectively and Reliably Aggregating Billions of Messages Per Day Using Apache Kafka®
     Osman Sarood, Infrastructure and Distributed Systems Lead, Mist Systems
     Chunky Gupta, Distributed Systems Engineer, Mist Systems
  2. Mist Architecture
     1 TB+ | 10 Billion+ Msgs | 10's of TB+ | 500+ partitions
     Live Aggregators: Real-time Aggregation System
     80% of DC on Spot, 70% cheaper (vs. reserved)
  3. Acknowledgement: Amarinder Singh Bindra, Ebrahim Safavi, Jitendra Harlalka
  4. Outline
     • How do we aggregate?
     • Live Aggregators architecture
     • Autoscaling
     • Multi-level Aggregations
  5. Real-time Processing/Aggregation
  6. What Live Aggregators is for You?
  7. What Live Aggregators is for You? (contd.): Total Time Series: 2; # Aggregation Operations: 8
  8. Terminologies
     • View: a set of tuples which contain aggregated data for a defined time interval, based on user-defined groupings
     • Grouping Columns: columns to use as aggregation keys
     • Aggregation Info: type of aggregation, what is aggregated, etc.
     • Time Series: series of data points for one set of grouping columns, in time order
     20+ aggregation types: Sum, Count, Percentiles, Median, Average, Distinct Count, Spatial Count, ...
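To make the terminology concrete, here is a minimal sketch of what a view definition could look like; the field names and format are illustrative assumptions, not Mist's actual configuration:

# Hypothetical view definition, illustrating the terminology above.
# Field names are assumptions, not Mist's actual configuration format.
view = {
    "name": "client_traffic_per_org",
    "interval": "10m",                    # time interval each output tuple covers
    "grouping_columns": ["org"],          # aggregation keys
    "aggregations": [                     # aggregation info
        {"op": "distinct_count", "column": "client", "as": "num_clients"},
        {"op": "sum", "column": "bytes_tx", "as": "total_bytes_tx"},
        {"op": "max", "column": "bytes_tx", "as": "max_bytes_tx"},
    ],
}
# Each (interval, org) combination yields one tuple; that tuple's values over
# successive intervals form a time series for the grouping.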
  9. Live Aggregators Architecture: LA Data Store
  10. Live Aggregators Executor
      Processes per executor (running on EC2 Spot Instances): Process 1: Kafka Reader, Process 2: Shared Memory Manager, Process 3: View Runner 1, Process 4: View Runner 2
      Example messages: Msg# 1 (Client: Sam, Bytes_tx: 100, Org: Mist), Msg# 2 (Client: John, Bytes_tx: 60, Org: Mist), Msg# 3 (Client: Ayaana, Bytes_tx: 20, Org: Home)
      View 1 State (Time Interval, Org, num_clients, total_bytes_tx): 00:00-00:10, Mist, 1, 100 after Msg# 1; 00:00-00:10, Mist, 2, 160 after Msg# 2
      View 2 State (Time Interval, Org, max_bytes_tx): 00:00-00:10, Mist, 100
      Checkpoint to / fetch checkpoint from S3
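A minimal sketch of the update-and-checkpoint idea behind the executor, assuming the Kafka Reader, Shared Memory Manager, and View Runners (separate processes in the real system) are collapsed into one class; names and message shapes are illustrative, not Mist's actual code:

import json

class ViewRunner:
    def __init__(self):
        # View 1 state keyed by (time interval, org), as in the slide's example
        self.state = {}

    def process(self, msg):
        key = (msg["interval"], msg["org"])
        row = self.state.setdefault(key, {"clients": set(), "total_bytes_tx": 0})
        row["clients"].add(msg["client"])
        row["total_bytes_tx"] += msg["bytes_tx"]

    def checkpoint(self):
        # In production a snapshot like this is written to S3 together with the
        # Kafka offset, so a replacement executor can fetch it and resume there.
        return json.dumps({
            f"{interval}/{org}": {"num_clients": len(v["clients"]),
                                  "total_bytes_tx": v["total_bytes_tx"]}
            for (interval, org), v in self.state.items()})

runner = ViewRunner()
runner.process({"interval": "00:00-00:10", "org": "Mist", "client": "Sam", "bytes_tx": 100})
runner.process({"interval": "00:00-00:10", "org": "Mist", "client": "John", "bytes_tx": 60})
print(runner.checkpoint())  # num_clients = 2, total_bytes_tx = 160, matching the slide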
  11. Autoscaling: Live Aggregators Scheduler
      LA Scheduler: View Queue (View1, View2, View3), Zookeeper Manager, Task Manager
      Component states move from Waiting -> Picked -> Running (View 1, View 2, View 3)
      Views are assigned to LA tasks per partition, e.g. LA Task 1 runs View 1: Partition 1, View 2: Partition 1, View 3: Partition 2
  12. Live Aggregators Scale • Message consumption rate from Kafka: 25 Billion+ reads per day [charts: ~620k messages per sec, ~480k messages per sec]
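For context, 25 billion reads per day averages out to roughly 290k messages per second (25e9 / 86,400 s), which is consistent with the per-second rates in the hundreds of thousands (~480k-620k) shown on the charts.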
  13. Live Aggregators Scale (contd.) • Number of Time Series: 300 Million+ at peak times • Aggregation Operations: 2 Million+ at peak times
  14. Live Aggregators Scale (contd.) • Memory Footprint: 2.5 TB+ at peak times • Writes to Cassandra: 4 Billion+ writes per day
  15. Reliable? Cost Effective? Scalable?
  16. Reliability 24x7: Spot Fleet, Controlled Chaos (stop and resume), Uncontrolled Chaos
  17. Spot Market Volatility: 800 Spot instances terminated in a single day! (more than our production DC)
  18. Live Aggregators Controller
      Lag = Timestamp of Most Recent Produced Msg - Timestamp of Last Msg LA Processed
      Msg #   Offset   Timestamp     Lag (sec)
      1       10       4:59:00 pm    60
      2       11       4:59:30 pm    30
      3       12       4:59:55 pm    5
      4       13       5:00:00 pm    0
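A minimal sketch of the lag computation above, reproducing the table's numbers; the function and variable names are placeholders, not Mist's actual controller code:

import datetime

def lag_seconds(most_recent_produced_ts, last_processed_ts):
    """Lag = timestamp of most recently produced msg - timestamp of last msg LA processed."""
    return (most_recent_produced_ts - last_processed_ts).total_seconds()

# The newest message on the topic was produced at 5:00:00 pm.
newest = datetime.datetime(2019, 4, 2, 17, 0, 0)
for offset, hms in [(10, (16, 59, 0)), (11, (16, 59, 30)), (12, (16, 59, 55)), (13, (17, 0, 0))]:
    last_processed = datetime.datetime(2019, 4, 2, *hms)
    print(offset, lag_seconds(newest, last_processed))  # 60.0, 30.0, 5.0, 0.0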
  19. Fast Recovery After Failure
  20. Dynamic Load (Trend vs Seasonality) [chart: daily seasonality plus long-term trend]
  21. Right Sizing: Best Fit
  22. Live Aggregators Executor
  23. Autoscaling: Live Aggregators Executor
      Component                  Cores
      Kafka Reader               0.2
      Shared memory (per view)   0.1
      View 1                     0.8
      View 2                     0.6
      View 3                     0.9
      [Diagram: per-process usage 0.8 + 0.6 + 0.2 + 0.2 = 1.8 cores for the task]
  24. Autoscaling: Live Aggregators Scheduler
      LA Scheduler: View Queue (View1 0.8 cores, View2 0.6 cores, View3 0.9 cores), Zookeeper Manager, Task Manager
      Per-component costs: Kafka Reader 0.2, shared memory (per view) 0.1, View 1 0.8, View 2 0.6, View 3 0.9
      Offer: 2 cores. LA Task 1 reserves Kafka Reader 0.2 + View1 0.8 + View2 0.6 + shared memory 0.1 per view (0.2 total) = 1.8 cores reserved
      Cores available as components are placed: 2.0 -> 1.8 -> 0.9 -> 0.2 (View3, needing 0.9 + 0.1, no longer fits in the remaining 0.2)
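A rough sketch of this kind of fit check against a resource offer, using the per-component costs from the slide; the first-fit-in-queue-order policy shown here is an assumption, not necessarily the scheduler's actual packing strategy:

KAFKA_READER_CORES = 0.2
SHARED_MEMORY_CORES_PER_VIEW = 0.1

def pack_views(offer_cores, view_queue):
    reserved = KAFKA_READER_CORES            # every LA task needs a Kafka Reader
    placed = []
    for name, view_cores in view_queue:
        needed = view_cores + SHARED_MEMORY_CORES_PER_VIEW
        if reserved + needed <= offer_cores:
            reserved += needed
            placed.append(name)
    return placed, round(reserved, 2)

print(pack_views(2.0, [("View1", 0.8), ("View2", 0.6), ("View3", 0.9)]))
# (['View1', 'View2'], 1.8) -- View3 (0.9 + 0.1 shared memory) does not fit in the remaining 0.2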
  25. Lying Factor
      Lying Factor = #Cores reserved - #Cores used
      [Chart: Lying Factor over time]
      Component                  Evening Load (Cores)   High Load (Cores)
      Kafka Reader               0.2                    0.3
      Shared memory (per view)   0.1 * 2                0.15 * 2
      View 1                     0.8                    0.9
      View 2                     0.6                    0.7
      Total Cores for LA Task    1.8                    2.2
      Reserved Cores             1.8                    1.8
      Lying Factor               0                      -0.4
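A minimal sketch of the lying-factor bookkeeping, reproducing the numbers in the table above (illustrative code, not the actual metric pipeline):

def lying_factor(reserved_cores, used_cores):
    # Lying factor = cores reserved - cores actually used
    return round(reserved_cores - used_cores, 2)

reserved = 1.8
# Evening load: Kafka Reader 0.2 + shared memory 0.1*2 + View 1 0.8 + View 2 0.6 = 1.8 cores
evening_use = 1.8
# High load:    Kafka Reader 0.3 + shared memory 0.15*2 + View 1 0.9 + View 2 0.7 = 2.2 cores
high_use = 2.2
print(lying_factor(reserved, evening_use))  # 0.0  -> reservation matches usage
print(lying_factor(reserved, high_use))     # -0.4 -> the task uses more than it reserved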
  26. Autoscaler: No Scaling • Lower Threshold = -0.05 Cores • Upper Threshold = 0.20 Cores
  27. Autoscaler: Scale Up (Noisy Neighbor!!) • Lower Threshold = -0.05 Cores • Upper Threshold = 0.20 Cores
  28. Autoscaler: Scale Down • Lower Threshold = -0.05 Cores • Upper Threshold = 0.20 Cores
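A sketch of the scaling decision these three slides describe, assuming the lying factor is compared against the two thresholds in the obvious way (scale up when it drops below the lower threshold, scale down when it exceeds the upper one); the exact policy is an assumption:

# Illustrative autoscaling decision based on the lying factor and the thresholds above.
LOWER_THRESHOLD = -0.05   # cores; below this the task is under-provisioned
UPPER_THRESHOLD = 0.20    # cores; above this the task is over-provisioned

def autoscale_decision(lying_factor):
    if lying_factor < LOWER_THRESHOLD:
        return "scale up"     # using noticeably more than reserved (e.g. noisy-neighbor pressure)
    if lying_factor > UPPER_THRESHOLD:
        return "scale down"   # reserving noticeably more than used
    return "no scaling"

for lf in (0.0, -0.4, 0.5):
    print(lf, autoscale_decision(lf))  # 0.0 no scaling, -0.4 scale up, 0.5 scale down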
  29. Autoscaling Effectiveness • Resources Used vs Reserved (Seasonality) [chart scale: ~1000 cores]
  30. Multi Level Aggregation (Heatmap Example)
      • Each device reports its location to Kafka every second (Mist office example)
      • Goal: Client Density Heatmap
      • Input is sharded by Client ID across multiple partitions
  31. Multi Level Aggregation
      LA Task 1 (Topic 1, partition 0), LA Task 2 (Topic 1, partition 1), LA Task 3 (Topic 1, partition 2): Consume Topic 1, Produce Topic 2
      LA Task 4 (Topic 2, partition 2): Consume Topic 2
      [Diagram: per-partition 4x4 client-count grids that the second-level task combines into one heatmap]
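A rough sketch of the two-level flow shown above: first-level tasks aggregate one partition of Topic 1 (sharded by client ID) and write partial, re-keyed counts to a second topic, and a second-level task merges them per map. Topic names, message shapes, and the in-memory lists standing in for Kafka producers/consumers are all illustrative assumptions:

from collections import defaultdict

def level1_aggregate(partition_msgs):
    # One first-level LA task: count clients per heatmap cell for its partition.
    partial = defaultdict(int)
    for msg in partition_msgs:
        partial[(msg["map_id"], msg["cell"])] += 1
    # In production these partials are produced to a new Kafka topic ("Topic 2"),
    # keyed by map_id so one second-level task sees every partial for a map.
    return [{"map_id": m, "cell": c, "count": n} for (m, c), n in partial.items()]

def level2_merge(intermediate_msgs):
    # One second-level LA task: merge partial counts into the final heatmap.
    heatmap = defaultdict(int)
    for msg in intermediate_msgs:
        heatmap[(msg["map_id"], msg["cell"])] += msg["count"]
    return dict(heatmap)

p0 = [{"map_id": "office", "cell": (1, 2)}, {"map_id": "office", "cell": (1, 2)}]
p1 = [{"map_id": "office", "cell": (1, 2)}, {"map_id": "office", "cell": (3, 0)}]
print(level2_merge(level1_aggregate(p0) + level1_aggregate(p1)))
# {('office', (1, 2)): 3, ('office', (3, 0)): 1}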
  32. Multi Level Aggregation: Client Density for a School. We will be adding the architecture diagram for this to explain.
  33. Future Work
      1. Joining multiple streams
      2. Instance-specific resource allocation
      3. Improving shared memory usage using Go
      4. Dynamic rescheduling of views to improve Kafka load
  34. Rate today's session. Thank You!
