Data Transformations on Ops Metrics using Kafka Streams (Srividhya Ramachandran, Priceline.com) Kafka Summit NYC 2019

How Priceline uses Kafka Streams to save TBs on the daily licenses of our monitoring systems. Kafka Streams powers a large part of our analytics and monitoring pipelines and transforms operational metrics in real time. All logs and operational metrics from the APIs of Priceline’s products flow into Kafka and are ingested into Splunk, our monitoring system, for alerting and monitoring. We have implemented data transformations, aggregations and summarizations with Kafka Streams to eliminate PCI/PII violations in the log data, and to aggregate metrics so that we ingest them only at the granularity we need instead of at sub-second resolution. We will cover the need for custom Serdes and custom partitioners, and why we don’t use the Confluent registry. You will also learn how Priceline uses a self-service model to configure its streams, topics and consumers through the Data Collection Console, our UI for managing the Kafka streaming pipelines.

Published in: Technology

  1. Data Transformations on Operational Metrics using Kafka Streams
     Vidhya Ramachandran, Mukund Murthy
  2. About Priceline
     Priceline.com is part of the Booking Holdings Group. The Booking Holdings Group is the world leader in online travel & related services. The Group’s mission is to help people experience the world.
  3. About Us
     Vidhya Ramachandran: Director Engineering, Common Platform Services
     Mukund Murthy: Software Engineer, Common Platform Services
  4. Legacy Monitoring System
  5. Legacy Architecture
     Diagram labels: Custom Alerting System; Custom Dashboards; Application Servers
  6. Kafka - The Only Constant!
  7. Current Architecture
  8. Motive behind going with Kafka & Kafka Streams
  9. Kafka Infrastructure & new Monitoring Solution & other Sinks
     Diagram labels: Config Reader; Event Listener; Applications; Message Router; Embedded Agent; Asynchronous Message Dispatcher; Kafka Consumer Cluster; Infrastructure Code embedded in the Products to Produce Events to the Kafka Topic; Splunk HEC; Add other sinks; Data Collection Console
  10. Priceline Data Collection Console
  11. Priceline Streams Multiplexing
     • Stream = events emitted by some source (e.g. a Priceline application)
     • Multiple streams can come from the same application and be written into the same topic
     • A stream can be emitted by multiple applications
     Diagram labels: Stream 1; Stream 3; Stream 2; Topic
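The multiplexing rules above imply that a consumer has to separate events by a stream identifier carried in each event, not by topic or by producing application. A minimal sketch of that idea in plain Python (not Kafka Streams code); the field names `app`, `stream` and `value` and the sample events are assumptions, since the slides do not show the event schema:

```python
from collections import defaultdict


def demultiplex(events):
    """Group events read from one topic by their stream identifier."""
    by_stream = defaultdict(list)
    for event in events:
        # "stream" is a hypothetical field name; the real schema is not shown.
        by_stream[event["stream"]].append(event)
    return dict(by_stream)


# One topic carrying three streams emitted by two hypothetical applications.
topic = [
    {"app": "checkout", "stream": "latency", "value": 120},
    {"app": "search", "stream": "latency", "value": 80},
    {"app": "checkout", "stream": "errors", "value": 1},
    {"app": "search", "stream": "cache_hits", "value": 42},
]

streams = demultiplex(topic)
```

Here the `latency` stream is emitted by two applications, and `checkout` emits two different streams into the same topic, which covers both cases named on the slide.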
  12. Priceline Data Collection Console
  13. Streams in Splunk before & after transformations
  14. License Reduction from Transformations
     Operational metrics data flowing through Kafka are converted to a pipe-separated format, and the keys are applied back as search-time extractions.
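The saving comes from not shipping the keys with every event. A minimal sketch of the conversion in plain Python, assuming a fixed field order shared between the transformation and the search-time extraction; the field list and the sample event are hypothetical, not taken from the talk:

```python
import json

# Hypothetical field order, agreed between the transformation and the
# search-time extraction that re-attaches the keys.
FIELDS = ["timestamp", "app", "metric", "value"]


def to_pipe_separated(event, fields=FIELDS):
    """Drop the keys and emit only the values, in a fixed order."""
    return "|".join(str(event.get(name, "")) for name in fields)


def from_pipe_separated(line, fields=FIELDS):
    """Re-attach the keys, as a search-time extraction would."""
    return dict(zip(fields, line.split("|")))


event = {
    "timestamp": "2019-01-01T00:00:00Z",
    "app": "checkout",
    "metric": "latency_ms",
    "value": 120,
}
line = to_pipe_separated(event)
restored = from_pipe_separated(line)
# Characters saved per event compared with the JSON encoding.
saved = len(json.dumps(event)) - len(line)
```

This sketch does not escape a `|` that appears inside a value, and every value comes back as a string; a real pipeline would have to handle both.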
  15. Transformations for PCI/PII
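The slide gives only the title, so the following is a guess at the kind of transformation involved, not the authors' implementation: masking card-like numbers and e-mail addresses in a log line before it is indexed. The two patterns and the sample line are assumptions, written in plain Python:

```python
import re

# Assumed patterns; a real deployment would need patterns reviewed
# against its own log formats.
CARD = re.compile(r"\b\d{13,19}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def mask_card(match):
    """Keep the last four digits and star out the rest."""
    digits = match.group()
    return "*" * (len(digits) - 4) + digits[-4:]


def mask(line):
    """Replace card-like numbers and e-mail addresses in one log line."""
    line = CARD.sub(mask_card, line)
    return EMAIL.sub("<email>", line)


masked = mask("card=4111111111111111 user=jane.doe@example.com status=ok")
```

Pattern-based masking of this kind only catches what the patterns anticipate, so it is a sketch of the mechanism and not a compliance guarantee.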
  16. Statsd Conversion
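The slide does not say in which direction the conversion runs. This sketch assumes metric events are rendered as StatsD lines, whose basic form is `<name>:<value>|<type>`; the event fields `app`, `metric`, `value` and `kind` are hypothetical:

```python
# StatsD type codes: c = counter, g = gauge, ms = timer.
TYPES = {"counter": "c", "gauge": "g", "timer": "ms"}


def to_statsd(event):
    """Render one metric event as a StatsD line: <name>:<value>|<type>."""
    name = f'{event["app"]}.{event["metric"]}'
    return f'{name}:{event["value"]}|{TYPES[event["kind"]]}'


line = to_statsd(
    {"app": "checkout", "metric": "latency", "value": 120, "kind": "timer"}
)
```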
  17. Topology – Kafka Streams
  18. Windowed Events
  19. Custom Partitioner
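The slide does not show the partitioning rule. One plausible reason for a custom partitioner in a windowed aggregation is to make sure all events of the same stream land in the same partition, so that one task sees the whole window. A plain-Python sketch under that assumption; the stream name and partition count are made up:

```python
import zlib


def partition_for(stream_name, num_partitions):
    """Map a stream name to a partition with a stable hash."""
    # crc32 is used because Python's built-in hash() for strings is
    # randomized per process and would not be stable across instances.
    return zlib.crc32(stream_name.encode("utf-8")) % num_partitions


first = partition_for("checkout.latency", 12)
second = partition_for("checkout.latency", 12)
```

Both calls return the same partition, which is the property the aggregation depends on.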
  20. Custom TimeStamp Extractor
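Kafka Streams exposes this hook as the `TimestampExtractor` interface, which lets windowing use the time an event occurred instead of the time the record was written. The slide shows no code, so this is only a plain-Python sketch of the logic such an extractor might contain; the `event_time` field name and its format are assumptions:

```python
from datetime import datetime, timezone


def extract_timestamp_ms(event, record_timestamp_ms):
    """Return event time in epoch milliseconds.

    Falls back to the record timestamp when the payload has no usable
    event time, so a malformed event does not break windowing.
    """
    raw = event.get("event_time")  # hypothetical field name
    if raw is None:
        return record_timestamp_ms
    try:
        parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")
    except (TypeError, ValueError):
        return record_timestamp_ms
    return int(parsed.replace(tzinfo=timezone.utc).timestamp() * 1000)


ts = extract_timestamp_ms({"event_time": "2019-01-01T00:00:05"}, 0)
fallback = extract_timestamp_ms({"event_time": "not a time"}, 1234)
```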
  21. License Reduction from Summarizations
  22. Late Arriving Events
     Windowed Event Entries
     |  T1   |  T2   |   T3    | ……
     | 1 2 3 | 5 6 7 | 9 10 11 |
     --------> 4 (T1)
     --------> 8 (T2)

     Key | Value
     T1  | 1,2,3
     T1  | 1,2,3,4
     T2  | 5,6,7
     T2  | 5,6,7,8
     T3  | 9,10,11
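The key/value rows on this slide can be reproduced with a small simulation: events are assigned to windows by event time, and a late event causes its window to be emitted again with the updated aggregate. This is a plain-Python sketch, not Kafka Streams code; the window size of 4 time units and the arrival order are assumptions chosen to match the slide, and Kafka Streams decides when to emit updates differently:

```python
from collections import defaultdict

WINDOW_SIZE = 4  # assumed: event times 1-4 fall in T1, 5-8 in T2, 9-12 in T3


def window_of(event_time):
    """Return the 1-based index of the window containing event_time."""
    return (event_time - 1) // WINDOW_SIZE + 1


def aggregate(arrivals):
    """Concatenate events per window; re-emit a window when a late event updates it."""
    windows = defaultdict(list)
    emitted = []  # (window label, aggregate) rows in emission order
    newest = 0
    for event_time in arrivals:
        w = window_of(event_time)
        if w > newest:
            # A newer window has opened: emit the one that just closed.
            if newest:
                emitted.append((f"T{newest}", list(windows[newest])))
            newest = w
        windows[w].append(event_time)
        if w < newest:
            # Late event: its window was already emitted, so emit an update.
            emitted.append((f"T{w}", list(windows[w])))
    if newest:
        emitted.append((f"T{newest}", list(windows[newest])))
    return emitted


# Events 4 and 8 arrive after their windows have closed.
rows = aggregate([1, 2, 3, 5, 6, 7, 4, 9, 10, 11, 8])
```

Each late event produces a second row for its window, so a downstream sink sees two entries for T1 and for T2 and has to treat the later one as the replacement.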
  23. Processing too long…
     Measure, and make sure to keep your processors simple and straightforward.
     Maintain the SLA of the monitoring system: < 5 seconds from event time…
  24. Exception Handling
  25. Gotchas…
  26. Testing Kafka Streams
     • Unit testing individual processors and transformers
     • Integration testing with an embedded Kafka cluster and multiple instances
     • Integration testing the aggregation by repeating the concatenated raw messages in the final aggregated event
  27. Debugging / Testing
  28. Kafka Monitoring – JMX Stats
     • Splunk add-on to poll local or remote JMX management servers
     • Indexes MBean attributes, outputs from MBean operations, and MBean notifications
     • Configurable templates to selectively monitor JMX stats
  29. Kafka Monitoring - Dashboards
  30. Kafka Monitoring - Alerts
  31. Monitoring Kafka Streams
     • Thread metrics
        • Average time for commit, poll and process operations
        • Tasks created per second, tasks closed per second
     • Task metrics
        • Average number of commits per second
        • Average commit time
     • Processor node metrics
        • Average and max processing time
        • Average number of process operations per second
        • Forward rate
     • State store metrics
        • Average execution time for put, get, and flush operations
        • Average number of put, get, and flush operations per second
  32. Conclusion & Next Steps
  33. Acknowledgements
     • Our Core Platform team for their contributions to the Streaming library and Data Collection Console
     • The Confluent support and sales team
  34. We are Hiring!!! https://careers.priceline.com/
  35. Questions
