This document provides a summary of distributed tracing in event-driven architectures using OpenTelemetry and Kafka. It discusses what distributed tracing is, the components of a distributed tracing system, and how OpenTelemetry can be used to instrument Kafka clients, Kafka Streams, and Kafka Connect. Specifically, it describes how the OpenTelemetry Java agent can be used to automatically instrument Kafka clients and stateless Kafka Streams processing. For stateful Kafka Streams processing and Kafka Connect, it discusses challenges and potential solutions around issues like caching and state management.
A Practical Guide To End-to-End Tracing In Event Driven Architectures with Roman Kolesnev
1. A Practical Guide To End-to-End Tracing In Event Driven Architectures
2. Roman - UK developer at PIE Labs
PIE Labs, Confluent
• What is Distributed Tracing?
• What is OpenTelemetry?
• Kafka Client instrumentation
• Kafka Streams instrumentation
• Kafka Connect instrumentation
• Summary
Who we are, what we’ll talk about…
3. What is Distributed Tracing (DT)?
https://www.altexsoft.com/blog/shipment-tracking-integration-apis-edis-carriers-aggregators/
Systems get complex…
4. Components of a DT system
• Instrumentation
• Collection
• Visualisation
https://blog.gurock.com/distributed-tracing/
5. What makes up a trace?
https://docs.logz.io/user-guide/distributed-tracing/what-is-tracing
9. Why Distributed Tracing?
Adding context to the message and process flow.
• Dependency graph
• Record of Event flow
• Log correlation
• Contextual metrics
• Answer questions like:
“This result looks weird. Show me all the intermediate states, so I can debug where the weirdness started…”
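One common way to add that context to a message is to carry the trace and span identifiers alongside the payload, for example in a Kafka record header. The sketch below assumes the W3C Trace Context `traceparent` format (version, trace id, span id, flags), which OpenTelemetry propagates by default. `TraceParent` and its methods are hypothetical helpers written for this illustration, not part of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: formats and parses a W3C `traceparent` header value. */
final class TraceParent {
    // Version "00", 32 hex chars of trace id, 16 hex chars of span id, 2 hex chars of flags.
    private static final Pattern FORMAT =
            Pattern.compile("^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

    final String traceId;
    final String spanId;
    final boolean sampled;

    TraceParent(String traceId, String spanId, boolean sampled) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.sampled = sampled;
    }

    /** Renders the value a producer would attach to an outgoing message. */
    String format() {
        return "00-" + traceId + "-" + spanId + "-" + (sampled ? "01" : "00");
    }

    /** Parses the value a consumer reads back; returns null if it is malformed. */
    static TraceParent parse(String header) {
        Matcher m = FORMAT.matcher(header);
        if (!m.matches()) {
            return null;
        }
        // Bit 0 of the flags byte is the "sampled" flag.
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) != 0;
        return new TraceParent(m.group(1), m.group(2), sampled);
    }

    public static void main(String[] args) {
        TraceParent outgoing = new TraceParent(
                "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7", true);
        String header = outgoing.format();
        TraceParent incoming = TraceParent.parse(header);
        System.out.println(header);
        System.out.println(incoming.traceId.equals(outgoing.traceId));
    }
}
```

Because every hop copies the same trace id and records its own span id as the parent of the next hop, a back-end can stitch the individual spans into the dependency graph and event flow listed above.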
10. Section: What is OpenTelemetry?
11. Overview of OpenTelemetry
1. Standardised, vendor-agnostic
2. High-quality, ubiquitous, and portable
3. Collection of tools, APIs, and SDKs
4. Instrument, generate
5. Collect, and export
6. To an observability back-end (not part of OpenTelemetry itself, e.g. Jaeger)
7. To help you analyze your software’s performance and behavior
12. Support for Kafka in OpenTelemetry
Kafka Clients:
• Javaagent
• Tracing Wrappers
• Tracing Interceptors
Kafka Streams:
• Javaagent
• Supply Kafka Clients with Tracing
OpenTelemetry instrumentation
13. Javaagent - Auto Instrumentation Agent
• Aspect Oriented approach
• Byte Buddy
• Muzzle
• Extension support
• Service Provider Interface (SPI) for tracer customization
• Installed at runtime through `-javaagent` Java option
OpenTelemetry instrumentation agent - how does it work?
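To illustrate the mechanism behind the `-javaagent` option: the JVM calls a `premain` method in the agent jar before the application's `main`, and hands it an `Instrumentation` instance through which class-file transformers can be registered. The sketch below is not the OpenTelemetry agent. `MatchingTransformer` and `SketchAgent` are hypothetical names, and the transformer only records which classes it would instrument, where a real agent would return rewritten bytecode (the OpenTelemetry agent generates it with Byte Buddy).

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical transformer: notes classes under a package prefix, changes nothing. */
final class MatchingTransformer implements ClassFileTransformer {
    // JVM internal name form, e.g. "org/apache/kafka/clients/".
    private final String internalPrefix;
    final List<String> matched = new CopyOnWriteArrayList<>();

    MatchingTransformer(String internalPrefix) {
        this.internalPrefix = internalPrefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className != null && className.startsWith(internalPrefix)) {
            matched.add(className); // a real agent would return rewritten bytecode here
        }
        return null; // null tells the JVM to keep the class bytes unchanged
    }

    public static void main(String[] args) {
        // Exercises the transformer directly, without starting a JVM agent.
        MatchingTransformer t = new MatchingTransformer("org/apache/kafka/clients/");
        t.transform(null, "org/apache/kafka/clients/producer/KafkaProducer", null, null, new byte[0]);
        t.transform(null, "java/lang/String", null, null, new byte[0]);
        System.out.println(t.matched);
    }
}

/**
 * Hypothetical agent entry point. In a real agent jar this class would be public,
 * live in its own file, and be named by `Premain-Class` in the jar manifest.
 */
final class SketchAgent {
    // Invoked by the JVM before main() when started as (jar name is a placeholder):
    //   java -javaagent:sketch-agent.jar -jar app.jar
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new MatchingTransformer("org/apache/kafka/clients/"));
    }
}
```

This is why the Javaagent needs no code or configuration changes in the application: instrumentation is applied to classes as they are loaded.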
14. Tracing Interceptors
• Standard Consumer / Producer Interceptor implementations
• Installed through Interceptor configuration
Tracing Interceptors - how do they work?
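As an illustration of installing through interceptor configuration: Kafka clients load the interceptors named in the `interceptor.classes` property. The sketch below builds such a configuration with plain `java.util.Properties` so that it runs without Kafka on the classpath. The interceptor class names are an assumption based on the OpenTelemetry `kafka-clients-2.6` library instrumentation and should be checked against the version in use; `TracingInterceptorConfig` is a hypothetical helper and the broker address is a placeholder.

```java
import java.util.Properties;

/** Hypothetical helper that prepares client configuration with tracing interceptors. */
final class TracingInterceptorConfig {
    // Kafka client config key (the value behind INTERCEPTOR_CLASSES_CONFIG).
    static final String INTERCEPTOR_CLASSES = "interceptor.classes";

    // Assumed class names; verify against your OpenTelemetry kafka-clients library version.
    static final String PRODUCER_INTERCEPTOR =
            "io.opentelemetry.instrumentation.kafkaclients.v2_6.TracingProducerInterceptor";
    static final String CONSUMER_INTERCEPTOR =
            "io.opentelemetry.instrumentation.kafkaclients.v2_6.TracingConsumerInterceptor";

    /** Properties to pass to a KafkaProducer, together with the usual serializers. */
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty(INTERCEPTOR_CLASSES, PRODUCER_INTERCEPTOR);
        return props;
    }

    /** Properties to pass to a KafkaConsumer, together with the usual deserializers. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty(INTERCEPTOR_CLASSES, CONSUMER_INTERCEPTOR);
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps("localhost:9092"));
        System.out.println(consumerProps("localhost:9092", "demo-group"));
    }
}
```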
15. Tracing Wrappers
• Standard Consumer / Producer Interface implementations
• Installed through code
• Uses java.lang.reflect.Proxy to intercept relevant method calls and add tracing behaviour
Tracing Wrappers - how do they work?
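The dynamic-proxy technique can be sketched with a stand-in interface. Here `Sender` is a hypothetical substitute for the Kafka `Producer` interface, and the tracing behaviour is reduced to recording an illustrative span name in a list; a real wrapper would start and end OpenTelemetry spans instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

final class TracingWrapper {
    /** Wraps a Sender so that every send() is recorded before being delegated. */
    static Sender wrap(Sender delegate, List<String> recordedSpans) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("send")) {
                recordedSpans.add(args[0] + " send"); // illustrative span name: "<topic> send"
            }
            try {
                return method.invoke(delegate, args); // every call still reaches the real object
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the delegate's own exception to the caller
            }
        };
        return (Sender) Proxy.newProxyInstance(
                Sender.class.getClassLoader(), new Class<?>[] {Sender.class}, handler);
    }

    public static void main(String[] args) {
        Sender plain = new Sender() {
            @Override public String send(String topic, String value) { return topic + ":" + value; }
            @Override public void close() { }
        };
        List<String> spans = new ArrayList<>();
        Sender traced = wrap(plain, spans);
        System.out.println(traced.send("orders", "hello"));
        traced.close();
        System.out.println(spans);
    }
}

/** Hypothetical stand-in for the Kafka Producer interface. */
interface Sender {
    String send(String topic, String value);
    void close();
}
```

The caller keeps programming against the interface, which is why this approach is installed through code: the application has to wrap its client instance explicitly.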
16. Section: Kafka Client instrumentation
27. Comparison of instrumentation methods
Aspect                 | Interceptor       | Wrapper           | Javaagent
Installation           | Config or Code    | Code              | Runtime
Producer Metadata      | Limited           | Full              | Full
Consumer Metadata      | Full              | Full              | Full
Consumer Local Context | Not propagated    | Not propagated    | Propagated
Library                | Kafka-clients-2.6 | Kafka-clients-2.6 | Kafka-clients-0.11
28. Section: Kafka Streams instrumentation
29. Kafka Streams support
• Kafka Streams specific Process span implementation in Javaagent
• Stateless Kafka Streams processing pipelines
• Limitations:
• Non-javaagent implementation limitations
• Stateful operation support
• Caching in stateful operations
46. Kafka Streams support - summary
• Supported as is with Javaagent:
• Stateless
• Stateful, with limitations: single thread context, no caching
• Wrapping State Store approach
• Inlining Span creation into Stateful operations
• Transformer hack - possible repartitioning
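One way to read the "Wrapping State Store" idea is to keep the trace context of the record that last wrote a key next to the stored value, so that a later read of that key can recover it. The sketch below is an assumption about the general shape, not the speaker's implementation. `SimpleStore`, `InMemoryStore`, `Traced`, and `TracingStore` are hypothetical types, they do not use the Kafka Streams `StateStore` API, and the wrapper assumes single-threaded access.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical wrapper that saves trace context alongside each stored value. */
final class TracingStore<K, V> implements SimpleStore<K, V> {
    private final SimpleStore<K, Traced<V>> inner;
    private String currentContext;  // context of the record currently being processed
    private String lastReadContext; // context recovered by the most recent get()

    TracingStore(SimpleStore<K, Traced<V>> inner) {
        this.inner = inner;
    }

    void setCurrentContext(String traceContext) {
        this.currentContext = traceContext;
    }

    String contextOfLastRead() {
        return lastReadContext;
    }

    @Override
    public void put(K key, V value) {
        inner.put(key, new Traced<>(value, currentContext));
    }

    @Override
    public V get(K key) {
        Traced<V> stored = inner.get(key);
        lastReadContext = (stored == null) ? null : stored.traceContext;
        return (stored == null) ? null : stored.value;
    }

    public static void main(String[] args) {
        TracingStore<String, Integer> counts = new TracingStore<>(new InMemoryStore<>());
        counts.setCurrentContext("trace-A"); // first record arrives
        counts.put("user-1", 1);
        counts.setCurrentContext("trace-B"); // a later record reads the same key
        Integer previous = counts.get("user-1");
        System.out.println(previous + " was written under " + counts.contextOfLastRead());
    }
}

/** Hypothetical minimal key-value store, standing in for a Kafka Streams state store. */
interface SimpleStore<K, V> {
    void put(K key, V value);
    V get(K key);
}

/** Simple map-backed implementation used only for the demonstration. */
final class InMemoryStore<K, V> implements SimpleStore<K, V> {
    private final Map<K, V> data = new HashMap<>();
    @Override public void put(K key, V value) { data.put(key, value); }
    @Override public V get(K key) { return data.get(key); }
}

/** A stored value paired with the trace context of the record that wrote it. */
final class Traced<V> {
    final V value;
    final String traceContext;

    Traced(V value, String traceContext) {
        this.value = value;
        this.traceContext = traceContext;
    }
}
```

With the context recovered on read, the processing of the later record could be linked to the trace that produced the stored state.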
47. Section: Kafka Connect instrumentation