
More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn


For several years, LinkedIn has been using Kafka MirrorMaker as the mirroring solution for copying data between Kafka clusters across data centers. However, as LinkedIn's data continued to grow, mirroring trillions of Kafka messages per day across data centers uncovered the scale limitations and operability challenges of Kafka MirrorMaker. To address these, we have developed a new mirroring solution, built on top of our stream ingestion service, Brooklin. Brooklin's mirroring solution aims to provide improved performance and stability, while facilitating better management via finer control of data pipelines. Through flushless Kafka produce, dynamic management of data pipelines, and per-partition error handling and flow control, we are able to increase throughput, better withstand consume and produce failures, and reduce overall operating costs. As a result, we have eliminated the major pain points of Kafka MirrorMaker.

In this talk, we will dive deeper into the challenges LinkedIn has faced with Kafka MirrorMaker, how we tackled them with Brooklin, and our plans for iterating further on this new mirroring solution.



  1. 1. More Data, More Problems: Scaling Kafka Mirroring Pipelines at LinkedIn Celia Kung Engineering Manager
  2. 2. Agenda Use Cases & Motivation Kafka MirrorMaker at LinkedIn Brooklin Mirroring Future
  3. 3. Use Cases & Motivation
  4. 4. Use Cases • Aggregating data from all data centers • Moving data from offline data stores into online environments • Moving data between LinkedIn and external cloud services
  5. 5. Motivation ● Kafka data at LinkedIn continues to grow rapidly
  6. 6. Motivation ● Kafka MirrorMaker (KMM) has not scaled well ● KMM is difficult to operate and maintain
  7. 7. Kafka MirrorMaker at LinkedIn
  8. 8. Kafka MirrorMaker at LinkedIn: 100+ pipelines, 9 data centers
  9. 9. Kafka MirrorMaker at LinkedIn: 100+ clusters, 6K+ hosts, 2T+ messages/day
  10. 10. Topology [diagram: in Datacenters A and B, tracking clusters are mirrored into aggregate-tracking clusters by KMM instances] • Each pipeline: ○ mirrors data from 1 source cluster to 1 destination cluster ○ constitutes its own KMM cluster
  11. 11. Topology [diagram: Datacenters A, B, C each host tracking, aggregate-tracking, metrics, and aggregate-metrics clusters, with a separate KMM cluster for every source-destination pipeline]
  12. 12. KMM does not scale well ● # of KMM clusters = (# of data centers)² x # of Kafka clusters ● More consumer-producer pairs → need to provision more hardware
  13. 13. KMM is difficult to operate ● Static configuration file per KMM cluster ● Changes require deploying to 100+ clusters
  14. 14. Topology [diagram: Datacenters A, B, C each host tracking, aggregate-tracking, metrics, and aggregate-metrics clusters, with a separate KMM cluster for every source-destination pipeline]
  15. 15. KMM is fragile ● Poor failure isolation ● Increased latency ● Unable to catch up with traffic
  16. 16. Brooklin Mirroring
  17. 17. Brooklin Mirroring ● Optimized for stability and operability ● Built on top of our streaming data pipelines service, Brooklin ● Brooklin Kafka mirroring has been in production for 1+ years ● Open-sourced Brooklin last month
  18. 18. Mirroring pipelines at LinkedIn: 100+ pipelines, 9 data centers
  19. 19. Kafka MirrorMaker: 100+ clusters, 6K+ hosts, 2T+ messages/day
  20. 20. Kafka MirrorMaker: 100+ clusters, 6K+ hosts, 2T+ messages/day
  21. 21. Brooklin Mirroring: 9 clusters, <2K hosts, 2T+ messages/day
  22. 22. Topology [diagram: in Datacenters A and B, a single Brooklin cluster mirrors tracking into aggregate-tracking] • A single Brooklin cluster encompasses multiple pipelines ○ 1 cluster per data center
  23. 23. Topology [diagram: Datacenters A, B, C each run one Brooklin cluster that handles all tracking and metrics mirroring pipelines]
  24. 24. Brooklin Mirroring Architecture
  25. 25. Kafka mirroring built on Brooklin [diagram: Brooklin sits between sources and destinations, which include messaging systems, Microsoft EventHubs, and databases]
  26. 26. Kafka mirroring built on Brooklin [same sources and destinations diagram]
  27. 27. Kafka mirroring built on Brooklin
  28. 28. Kafka mirroring built on Brooklin [architecture: Brooklin Engine with Kafka src connector and Kafka dest connector, Management Rest API, Diagnostics Rest API, ZooKeeper, management/monitoring portal, SRE/op dashboards]
  29. 29. Kafka mirroring built on Brooklin [same architecture diagram]
  30. 30. Dynamic Management [same architecture diagram]
  31. 31. Creating a pipeline (Brooklin Engine, Management Rest API, ZooKeeper)
      create: POST /datastream
        name: mm_DC1-tracking_DC2-aggregate-tracking
        connectorName: KafkaMirrorMaker
        source:
          connectionString: kafkassl://DC1-tracking-vip:12345/topicA|topicB
        destination:
          connectionString: kafkassl://DC2-aggregate-tracking-vip:12345
        metadata:
          num-streams: 5
  32. 32. Creating a pipeline [diagram: ZooKeeper coordinating a set of Brooklin hosts]
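As a hedged illustration only: issuing the create request shown on slide 31 from a plain Java client could look roughly like the sketch below. The brooklin-host name, the port, and the JSON rendering of the datastream fields are assumptions made for this example (Brooklin's Management API is Rest.li based, so the exact wire format may differ), and authentication is omitted.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class CreateMirroringDatastream {
        public static void main(String[] args) throws Exception {
            // Datastream definition from slide 31, rendered as JSON (field layout is an assumption).
            String body = """
                {
                  "name": "mm_DC1-tracking_DC2-aggregate-tracking",
                  "connectorName": "KafkaMirrorMaker",
                  "source": {"connectionString": "kafkassl://DC1-tracking-vip:12345/topicA|topicB"},
                  "destination": {"connectionString": "kafkassl://DC2-aggregate-tracking-vip:12345"},
                  "metadata": {"num-streams": "5"}
                }
                """;

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://brooklin-host:32311/datastream"))  // placeholder host and port
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }

The update on slide 33 follows the same pattern with a PUT to the datastream's path and the expanded topic list.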
  33. 33. Updating a pipeline (Brooklin Engine, Management Rest API, ZooKeeper)
      update: PUT /datastream/mm_DC1-tracking_DC2-aggregate-tracking
        name: mm_DC1-tracking_DC2-aggregate-tracking
        connectorName: KafkaMirrorMaker
        source:
          connectionString: kafkassl://DC1-tracking-vip:12345/topicA|topicB|topicC|topicD
        destination:
          connectionString: kafkassl://DC2-aggregate-tracking-vip:12345
        metadata:
          num-streams: 10
  34. 34. Updating a pipeline [diagram: ZooKeeper coordinating a set of Brooklin hosts]
  35. 35. Dynamic Management [same architecture diagram]
  36. 36. On-demand Diagnostics (Brooklin Engine, Diagnostics Rest API, ZooKeeper)
      getAllStatus: GET /diag?datastream=mm_DC1-tracking_DC2-aggregate-tracking
        host1.prod.linkedin.com:
          datastream: mm_DC1-tracking_DC2-aggregate-tracking
          assignedTopicPartitions: [topicA-0, topicA-3, topicB-0, topicB-2]
          autoPausedPartitions: [{topicA-3: {reason: SEND_ERROR, description: failed to produce messages from this partition}}]
          manuallyPausedPartitions: []
        host2.prod.linkedin.com:
          datastream: mm_DC1-tracking_DC2-aggregate-tracking
          assignedTopicPartitions: [topicA-1, topicA-2, topicB-1, topicB-3]
          autoPausedPartitions: []
          manuallyPausedPartitions: []
  37. 37. Error Isolation ● Manually pause and resume mirroring at every level ○ Entire pipeline, topic, topic-partition ● Brooklin can automatically pause mirroring of partitions ○ Auto-resumes the partitions after a configurable duration ● Flow of messages from other partitions continues
  38. 38. Processing Loop
      while (!shutdown) {
        records = consumer.poll();
        producer.send(records);
        if (timeToCommit) {
          producer.flush();
          consumer.commit();
        }
      }
  39. 39. Producer flush can be expensive
      while (!shutdown) {
        records = consumer.poll();
        producer.send(records);
        if (timeToCommit) {
          producer.flush();
          consumer.commit();
        }
      }
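Spelled out with the Kafka Java client, the loop on slides 38-39 looks roughly like the sketch below. The topic name, bootstrap servers, and the fixed 60-second commit interval are illustrative assumptions, not LinkedIn's configuration; the relevant behavior is that producer.flush() blocks until every buffered record is acknowledged by the destination cluster, so one slow or unavailable destination partition stalls the entire consumer.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class FlushingMirrorLoop {
        public static void main(String[] args) {
            Properties consumerProps = new Properties();
            consumerProps.put("bootstrap.servers", "source-kafka:9092");       // placeholder
            consumerProps.put("group.id", "mirror-example");
            consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            Properties producerProps = new Properties();
            producerProps.put("bootstrap.servers", "destination-kafka:9092");  // placeholder
            producerProps.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
            producerProps.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(consumerProps);
                 KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
                consumer.subscribe(List.of("topicA"));
                long lastCommit = System.currentTimeMillis();
                while (true) {  // a real mirror would check a shutdown flag here
                    ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<byte[], byte[]> r : records) {
                        // Re-produce each record to the same topic name on the destination cluster.
                        producer.send(new ProducerRecord<>(r.topic(), r.key(), r.value()));
                    }
                    if (System.currentTimeMillis() - lastCommit > 60_000) {
                        producer.flush();       // blocks until every in-flight send is acknowledged
                        consumer.commitSync();  // only then is it safe to commit consumed offsets
                        lastCommit = System.currentTimeMillis();
                    }
                }
            }
        }
    }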
  40. 40. Long Flush [chart] producer.flush() can take several minutes
  41. 41. Rebalance Storms [chart] consumers rebalance after max.poll.interval.ms
  42. 42. Increase max.poll.interval.ms? ● Reduces chances of consumer rebalance ● Risks detecting real failures late
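For reference, raising that limit is a single consumer setting. A minimal sketch, with the 10-minute value chosen purely for illustration rather than as a recommendation:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;

    public class MirrorConsumerConfig {
        // Returns consumer properties with a raised max.poll.interval.ms.
        // The default is 300000 ms (5 minutes).
        public static Properties withLongerPollInterval(Properties base) {
            Properties props = new Properties();
            props.putAll(base);
            // More headroom before the group coordinator evicts a slow consumer and
            // triggers a rebalance, at the cost of detecting genuinely stuck instances later.
            props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "600000");
            return props;
        }
    }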
  43. 43. Flushless Produce consumer.poll() → producer.send(records) → producer.flush() → consumer.commit()
  44. 44. Flushless Produce Only commit “safe” acknowledged checkpoints: consumer.poll() → producer.send(records) → consumer.commit(offsets)
  45. 45. Flushless Produce [diagram: consumer polls offsets o1, o2 from source partition sp0; producer sends them to destination partitions dp0, dp1 and reports ack(sp0, o2) to the checkpoint manager] ● Checkpoint manager maintains producer-acknowledged offsets for each source partition ● Source partition sp0: in-flight: [], acked: [], safe checkpoint: --
  46. 46. Flushless Produce ● Source partition sp0: in-flight: [o1, o2], acked: [], safe checkpoint: --
  47. 47. Flushless Produce ● Source partition sp0: in-flight: [o1, o2], acked: [], safe checkpoint: --
  48. 48. Flushless Produce ● Source partition sp0: in-flight: [o1, o2], acked: [], safe checkpoint: --
  49. 49. Flushless Produce ● Source partition sp0: in-flight: [o1, o2], acked: [], safe checkpoint: --
  50. 50. Flushless Produce ● Source partition sp0: in-flight: [o1], acked: [o2], safe checkpoint: --
  51. 51. Flushless Produce ● Source partition sp0: in-flight: [o1], acked: [o2], safe checkpoint: --
  52. 52. Flushless Produce [diagram: consumer polls offsets o3, o4 from sp0; producer reports ack(sp0, o1)] ● Update safe checkpoint to largest acknowledged offset that is less than oldest in-flight (if any) ● Source partition sp0: in-flight: [o3, o4], acked: [o1, o2], safe checkpoint:
  53. 53. Flushless Produce ● Source partition sp0: in-flight: [o3, o4], acked: [], safe checkpoint: o2
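A minimal Java sketch of this bookkeeping is shown below. The class and method names are hypothetical, not Brooklin's actual implementation; it only encodes the rule from slides 52-53: the safe checkpoint is the largest producer-acknowledged offset that is smaller than the oldest in-flight offset for that source partition.

    import java.util.Map;
    import java.util.TreeSet;
    import java.util.concurrent.ConcurrentHashMap;
    import org.apache.kafka.common.TopicPartition;

    // Illustrative per-source-partition bookkeeping; names are hypothetical, not Brooklin's code.
    public class CheckpointManager {
        private static class PartitionState {
            final TreeSet<Long> inFlight = new TreeSet<>();  // sent to producer, not yet acknowledged
            final TreeSet<Long> acked = new TreeSet<>();     // acknowledged by the destination cluster
            Long safeCheckpoint = null;                      // highest offset that is safe to commit
        }

        private final Map<TopicPartition, PartitionState> states = new ConcurrentHashMap<>();

        // Called just before producer.send() for a record consumed from source partition sp.
        public synchronized void onSend(TopicPartition sp, long offset) {
            states.computeIfAbsent(sp, k -> new PartitionState()).inFlight.add(offset);
        }

        // Called from the producer's send callback once the record is acknowledged.
        public synchronized void onAck(TopicPartition sp, long offset) {
            PartitionState s = states.get(sp);
            if (s == null) {
                return;
            }
            s.inFlight.remove(offset);
            s.acked.add(offset);
            // Safe checkpoint = largest acked offset below the oldest in-flight offset
            // (or simply the largest acked offset if nothing is in flight).
            long bound = s.inFlight.isEmpty() ? Long.MAX_VALUE : s.inFlight.first();
            Long candidate = s.acked.lower(bound);
            if (candidate != null) {
                s.safeCheckpoint = candidate;
                s.acked.headSet(candidate, true).clear();  // offsets at or below the checkpoint are no longer needed
            }
        }

        // The commit path reads this per partition instead of relying on producer.flush().
        public synchronized Long safeCheckpoint(TopicPartition sp) {
            PartitionState s = states.get(sp);
            return s == null ? null : s.safeCheckpoint;
        }
    }

In such a loop, onSend() would be invoked just before producer.send(), onAck() from the producer's send callback, and the periodic commit would pass safeCheckpoint + 1 per source partition to consumer.commitSync(), with no producer.flush() required.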
  54. 54. Brooklin Mirroring Performance ● Use same consumer/producer configs as KMM ● Single host: 64 GB memory, 24 CPU (12 cores each) ● Metrics: ○ Throughput (output compressed bytes/sec) ○ Memory utilization % ○ CPU utilization %
  55. 55. Throughput
  56. 56. Memory Utilization
  57. 57. CPU Utilization
  58. 58. Brooklin Mirroring Performance ● Brooklin mirroring is CPU-bound ● Metrics (20 consumer-producer pairs): ○ Throughput: up to 28 MB/s ○ Memory utilization: 70% ○ CPU utilization: 97%
  59. 59. Performance • 70%+ CPU time spent in decompression & re-compression ○ GZIPInputStream.read(): ~10% ○ GZIPOutputStream.write(): ~61% • Introduced “Passthrough” mirroring
  60. 60. Future
  61. 61. Stability • Rebalances cause drop in availability • Brooklin-controlled partition assignment & Kafka low-level consumer
  62. 62. Scalability • Auto-scaling: adjust number of consumers based on throughput needs • Smarter (throughput-based) partition assignment
  63. 63. Thank you
