From a Million to a Trillion Events Per Day: Stream Processing in Ludicrous Mode (Shrijeet Paliwal, Tesla) Kafka Summit NYC 2019

In this talk we'll describe the evolution of stream processing at Tesla and the challenges that are specific to our needs, such as large skews in message-processing latencies. We'll describe how we built a reliable and performant ingestion platform that allows us to take an idea from a whiteboard to production in a matter of hours. We'll also discuss the design principles, tools, and incident response processes that have enabled a small team to support Kafka and downstream services in highly available, multi-tenant environments at scale.

Published in: Technology

  1. From a Million to a Trillion Events Per Day: Stream Processing in Ludicrous Mode. Shrijeet Paliwal (@shrijeet), Kafka Summit NYC 2019
  2. Tesla fleet
  3. The quirks
     • Physical system constraints: limited resources, prolonged disconnectivity
     • Skewed processing latencies: seconds to hours
     • Payload explosion: 10000x
  4. Payload explosion: 300/sec to 3M/sec
  5. Dimensions of growth: 100s of powerpacks, 10s of pods/pack, 1000s of signals/sec/pod
  6. Ingest - the evolution
     • 2012: subset of the fleet, 50K/day
     • 2015: Data-Pump; full fleet, big messages, 5M/day
     • 2017: Kafka Connect; canonical messages, 50B/day
     • 2018: 500B/day, prolific adoption
  7. Ingest - needs & wants
     • One stream, one app: vastly simpler operationally
     • Isolated dependencies, resources, ownership
     • Micro batching: align flushes with commits
     • Scaling through multiple degrees of freedom
  8. Kafka Channels
     • Supreme composability
     • Minimal and consistent APIs
     • Back-pressure, buffering, transformations, failure recovery
     Image source: https://www.slideshare.net/kpciesielski/reactive-kafka-with-akka-streams
  9. Avro - you say?
  10. Expectation vs. reality
  11. Raw to Canonical channel: Source, Flow, Sink; { k : v }
  12. Kafka Channel building blocks
     • Stages: Filter, Broadcast, FileSystem, Batch, Source, Database
     • Channels: KafkaToKafka, KafkaToDB, KafkaToFS
  13. From whiteboard to production
     • Implement parser
     • Define schema
     • Configure the pipe
  14. [de]Cluttered observability
     • Kafka is very simple to monitor & observe
     • Kafka dashboard(s): most valuable self-service tool
     • Collect (all) > Plot (some) > Alert (start with one!)
  15. Start with one alert! Kafka Availability SLI (diagram labels: Producer, Source, Sink; x msg/sec, y msg/sec)
  16. Moving off from the offsets… towards time-based tooling!
     • How fresh is my data?
     • What is the ETA for lag recovery?
     • Can we roll back the stream to 2pm?
  17. How fresh is my data?
  18. Help me, Help you! On-boarding checklist
     • Size: compression, schema
     • Throughput: msg/sec, req/sec
     • Rebalance rate: gc, process latency
     • Resources: cpu, memory, network
  19. Dry-ops profiling: disables all external calls!
  20. Generation, Storage, Consumption. Accelerate the world's transition to sustainable energy. Thank you!
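
The time-based questions on slide 16 ("How fresh is my data?", "What is the ETA for lag recovery?") can be answered with simple arithmetic once lag is expressed alongside timestamps and rates. The slides do not show how Tesla computes these, so the following is only a minimal sketch under stated assumptions: `LagSnapshot` and both functions are hypothetical names, and the ETA assumes that produce and consume rates stay constant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LagSnapshot:
    """One observation of a consumer on a single stream (hypothetical shape)."""
    now_ms: int                  # wall-clock time of the observation
    last_consumed_event_ms: int  # timestamp of the newest record already processed
    lag_msgs: int                # records produced but not yet consumed
    produce_rate: float          # records/sec arriving on the stream
    consume_rate: float          # records/sec the consumer is processing


def freshness_seconds(s: LagSnapshot) -> float:
    """How far the consumed data trails wall-clock time, in seconds."""
    return max(0, s.now_ms - s.last_consumed_event_ms) / 1000.0


def lag_recovery_eta_seconds(s: LagSnapshot) -> Optional[float]:
    """Seconds until lag drains to zero, or None if the consumer is not catching up."""
    if s.lag_msgs <= 0:
        return 0.0
    drain_rate = s.consume_rate - s.produce_rate  # net records/sec removed from the backlog
    if drain_rate <= 0:
        return None  # lag is flat or growing, so no finite ETA exists
    return s.lag_msgs / drain_rate


snapshot = LagSnapshot(
    now_ms=1_000_000,
    last_consumed_event_ms=940_000,
    lag_msgs=1_200_000,
    produce_rate=3000.0,
    consume_rate=5000.0,
)
print(freshness_seconds(snapshot))         # 60.0
print(lag_recovery_eta_seconds(snapshot))  # 600.0
```

Reporting "60 seconds behind, 10 minutes to recover" is usually easier for a stream owner to act on than a raw offset difference, which is the point slide 16 appears to make.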

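
Slide 7 lists "Micro batching align flushes with commits" as an ingest requirement without showing an implementation. One common reading is that a batch is written to the sink first and the consumer offset is committed only afterwards, so a failed flush is re-read rather than lost. The sketch below illustrates that reading with in-memory stand-ins; `run_micro_batches`, `flush`, and `commit` are hypothetical names, not part of any Kafka client API.

```python
from typing import Callable, Iterable, List, Tuple

Record = Tuple[int, str]  # (offset, payload)


def _flush_then_commit(
    batch: List[Record],
    flush: Callable[[List[str]], None],
    commit: Callable[[int], None],
) -> None:
    flush([payload for _, payload in batch])
    # Reached only if flush succeeded. Following Kafka's convention, the
    # committed offset is the next offset to read (last processed + 1).
    commit(batch[-1][0] + 1)


def run_micro_batches(
    records: Iterable[Record],
    flush: Callable[[List[str]], None],
    commit: Callable[[int], None],
    batch_size: int,
) -> None:
    """Group records into batches; flush each batch, then commit its end offset."""
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) >= batch_size:
            _flush_then_commit(batch, flush, commit)
            batch = []
    if batch:  # flush the final, possibly short, batch
        _flush_then_commit(batch, flush, commit)


flushed: List[List[str]] = []
committed: List[int] = []
records = [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (4, "e")]
run_micro_batches(records, flushed.append, committed.append, batch_size=2)
print(flushed)    # [['a', 'b'], ['c', 'd'], ['e']]
print(committed)  # [2, 4, 5]
```

Because the commit follows the flush, a crash between the two replays the batch on restart. That gives at-least-once delivery, so the sink needs to tolerate duplicates.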