Writing Blazing Fast, and Production-Ready
Kafka Streams apps (in less than 30 min)
using Azkarra
Kafka Summit Europe 2021
Florian HUSSONNOIS

Hi, I'm Florian Hussonnois
@fhussonnois
Consultant, Trainer, Software Engineer
Co-founder @StreamThoughts
Confluent Community Catalyst (2019/2021)
Apache Kafka Streams contributor
Open Source Technology Enthusiast
- Azkarra Streams
- Kafka Connect File Pulse
- Kafka Streams CEP
- Kafka Client for Kotlin
Like me, you probably started with the famous Word Count!
KStream<String, String> source = builder.stream("streams-plaintext-input");
source.flatMapValues(splitAndToLowercase())
.groupBy((key, value) -> value)
.count(Materialized.as("counts-store"))
.toStream()
.to("streams-wordcount-output", Produced.with(Serdes.String(), Serdes.Long()));
Topology topology = builder.build();
The same code, annotated: (1) Consume from the source topic; (2) Transform, then Aggregate/Join — the groupBy(key) triggers a repartition, and this is where stateful stream processing happens; (3) Produce to the output topic.
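The splitAndToLowercase() helper is only referenced on the slides; a minimal sketch of what it could be (the body below is an assumption):

import org.apache.kafka.streams.kstream.ValueMapper;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: split each line on non-word characters and lowercase every word.
static ValueMapper<String, List<String>> splitAndToLowercase() {
    return value -> Arrays.asList(value.toLowerCase().split("\\W+"));
}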
public class WordCount {

    public static void main(String[] args) {
        var builder = new StreamsBuilder();

        KStream<String, String> source = builder.stream("streams-plaintext-input");
        source.flatMapValues(splitAndToLowercase())
              .groupBy((key, value) -> value)
              .count(Materialized.as("counts-store"))
              .toStream()
              .to("streams-wordcount-output", Produced.with(Serdes.String(), Serdes.Long()));
        var topology = builder.build();

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        var streams = new KafkaStreams(topology, props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
One class mixes three concerns: core logic (the topology), configuration, and execution.
Can we deploy a Kafka Streams application like this one in production, without any changes?
The answer is No!
(Well, unless you are testing your app in production…cough, cough...)
OK, nobody does that!
Some requirements before moving into production
Our TODO list:
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
Business Value vs Effort
[Chart: in a Kafka Streams application, the topology (business logic) delivers high business value for low/medium effort, while everything around it — streams lifecycle management, error-handling logic, Interactive Queries, monitoring/health-checks, security, config externalization, RocksDB tuning, offsets and lags, packaging — is high effort for low business value.]
Azkarra Framework in a nutshell
A lightweight Java framework to make a Kafka Streams application production-ready in just a few lines of code.
■ Distributed under the Apache License 2.0.
■ Developed based on experience on a wide range of projects.
■ Uses best practices developed by Kafka users and the open-source community.
Overview:
■ REST API: Health Check, Monitoring, Interactive Queries, etc.
■ Embedded WebUI: Topology DAG Visualization
■ Built-in features for handling exceptions and tuning RocksDB
■ Support for Server-Sent Events
#azkarrastreams
Azkarra Streams: how to use it?
Available on Maven Central.

Azkarra Framework:
<dependency>
  <groupId>io.streamthoughts</groupId>
  <artifactId>azkarra-streams</artifactId>
  <version>0.9.2</version>
</dependency>

Provides reusable classes for Kafka Streams:
<dependency>
  <groupId>io.streamthoughts</groupId>
  <artifactId>azkarra-commons</artifactId>
  <version>0.9.2</version>
</dependency>

Quick start:
mvn archetype:generate \
  -DarchetypeGroupId=io.streamthoughts \
  -DarchetypeArtifactId=azkarra-quickstart-java \
  -DarchetypeVersion=0.9.2 \
  -DgroupId=azkarra.streams \
  -DartifactId=azkarra-getting-started \
  -Dversion=1.0 \
  -Dpackage=azkarra \
  -DinteractiveMode=false
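Assuming the archetype generates a standard Maven project (the jar name below simply follows the -DartifactId and -Dversion above, and is an assumption), building and launching the quickstart would look something like:

cd azkarra-getting-started
mvn clean package
# The exact jar name depends on how the archetype configures packaging
java -jar target/azkarra-getting-started-1.0.jar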
Let's rewrite the "Word Count" using Azkarra
(we still have 25' left) 👾
Concepts: TopologyProvider
A container for building and configuring a Topology.
class WordCountTopology
implements TopologyProvider, Configurable {
private Conf conf;
@Override
public Topology topology() {
var source = conf.getString("topic.source.name");
var sink = conf.getString("topic.sink.name");
var store = conf.getString("store.name");
var builder = new StreamsBuilder();
builder
.<String, String>stream(source)
.flatMapValues(splitAndToLowercase())
.groupBy((key, value) -> value)
.count(Materialized.as(store))
.toStream()
.to(sink, Produced.with(Serdes.String(), Serdes.Long()));
return builder.build();
}
@Override
public void configure(final Conf conf) { this.conf = conf; }
@Override
public String version() { return "1.0"; }
}
Concepts: StreamsExecutionEnvironment
Manages the lifecycle of KafkaStreams instances.
// (1) Define the KafkaStreams configuration
var streamsConfig = Conf.of(
BOOTSTRAP_SERVERS_CONFIG, "localhost:9092",
DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass(),
DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass()
);
// (2) Define the Topology configuration
var topologyConfig = Conf.of(
"topic.source.name", "topic-text-lines",
"topic.sink.name", "topic-text-word-count",
"store.name", "Count"
);
// (3) Create and configure a local execution environment
var env = LocalStreamsExecutionEnvironment
.create(Conf.of("streams", streamsConfig))
// (4) Register our topology to run
.registerTopology(
WordCountTopology::new,
Executed.as("WordCount").withConfig(topologyConfig)
);
// (5) Start the environment
env.start();
// (6) Add Shutdown Hook
Runtime.getRuntime()
.addShutdownHook(new Thread(env::stop));
Let's start KafkaStreams… Boom! Transient Errors
word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1] Received error code INCOMPLETE_SOURCE_TOPIC_METADATA
16:05:12.585 [word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1] ERROR
org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer
clientId=word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1-consumer, groupId=word-count-1-0] User provided listener
org.apache.kafka.streams.processor.internals.StreamsRebalanceListener failed on invocation of onPartitionsAssigned for partitions []
org.apache.kafka.streams.errors.MissingSourceTopicException: One or more source topics were missing during rebalance
at org.apache.kafka.streams.processor.internals.StreamsRebalanceListener.onPartitionsAssigned(StreamsRebalanceListener.java:57)
~[kafka-streams-2.7.0.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293) [kafka-clients-2.7.0.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430) [kafka-clients-2.7.0.jar:?]
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:451) [kafka-clients-2.7.0.jar:?]
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:367) [kafka-clients-2.7.0.jar:?]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508) [kafka-clients-2.7.0.jar:?]
Concepts: StreamsLifecycleInterceptor
interface StreamsLifecycleInterceptor {

    /**
     * Intercepts the streams instance before being started.
     */
    default void onStart(StreamsLifecycleContext context,
                         StreamsLifecycleChain chain) {
        chain.execute();
    }

    /**
     * Intercepts the streams instance before being stopped.
     */
    default void onStop(StreamsLifecycleContext context,
                        StreamsLifecycleChain chain) {
        chain.execute();
    }

    /**
     * Used for logging information.
     */
    default String name() {
        return getClass().getSimpleName();
    }
}
A pluggable interface that allows intercepting a KafkaStreams instance before it is started or stopped.
Built-in implementations:
■ AutoCreateTopicsInterceptor
■ WaitForSourceTopicsInterceptor
■ KafkaBrokerReadyInterceptor
...and a few more (discussed later) 😉
Most interceptors are configurable.
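As an illustration of the contract, a custom interceptor could look like this (a hedged sketch based only on the interface shown above, not one of the built-ins):

// Minimal custom interceptor sketch: logs around start/stop.
public class LoggingLifecycleInterceptor implements StreamsLifecycleInterceptor {

    @Override
    public void onStart(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        System.out.println("Streams instance is about to start");
        chain.execute(); // always delegate to the next interceptor in the chain
    }

    @Override
    public void onStop(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        chain.execute();
        System.out.println("Streams instance has been stopped");
    }
}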
Concepts: AutoCreateTopicsInterceptor

import static io.s.a.r.i.AutoCreateTopicsInterceptorConfig.*;
// (1) Define the KafkaStreams configuration
var streamsConfig = ...
// (2) Define the Topology configuration
var topologyConfig = ...
// (3) Define the Environment configuration
var envConfig = Conf.of(
"streams", streamsConfig,
AUTO_CREATE_TOPICS_NUM_PARTITIONS_CONFIG, 2,
AUTO_CREATE_TOPICS_REPLICATION_FACTOR_CONFIG, 1,
// WARN - ONLY DURING DEVELOPMENT
AUTO_DELETE_TOPICS_ENABLE_CONFIG, true
);
// (4) Create and configure the local execution environment
LocalStreamsExecutionEnvironment
.create(envConfig)
// (5) Add the StreamLifecycleInterceptor
.addStreamsLifecycleInterceptor(
AutoCreateTopicsInterceptor::new
)
// ...code omitted for clarity
Automatically infers the source and sink topics to be created from Topology.describe().
■ Internally, uses the AdminClient API.
■ Can be used during development for deleting all topics when the instance is stopped.
What's left to do? Externalizing configuration (we have 20' left) 😀
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
External Configuration: Conf & AzkarraConf

// file: application.conf
azkarra {
  // The configuration settings passed to the Kafka Streams
  // instance should be prefixed with `streams`
  streams {
    bootstrap.servers = "localhost:9092"
    default.key.serde = "org.apache.kafka..Serdes$StringSerde"
    default.value.serde = "org.apache.kafka..Serdes$StringSerde"
  }
  topic.source.name = "topic-text-lines"
  topic.sink.name = "topic-text-word-count"
  store.name = "Count"
  auto.create.topics.num.partitions = 2
  auto.create.topics.replication.factor = 1
  auto.delete.topics.enable = true
}

// file: Main.class
var config = AzkarraConf.create().getSubConf("azkarra");

Azkarra provides the Configurable interface, which can be implemented by most Azkarra components:
void configure(final Conf configuration);

■ AzkarraConf: uses the Lightbend Config library.
  ○ Allows loading configuration settings from HOCON files.
Concepts: AzkarraContext
A container for dependency injection. Used to automatically configure streams execution environments.
public static void main(final String[] args) {
// (1) Load the configuration (application.conf)
var config = AzkarraConf.create().getSubConf("azkarra");
// (2) Create the Azkarra Context
var context = DefaultAzkarraContext.create(config);
// (3) Register StreamLifecycleInterceptor as component
context.registerComponent(
ConsoleStreamsLifecycleInterceptor.class
);
// (4) Register the Topology to the default environment
context.addTopology(
WordCountTopology.class,
Executed.as("word-count")
);
// (5) Start the context
context
.setRegisterShutdownHook(true)
.start();
}
Concepts: AzkarraApplication
Used to bootstrap and configure an Azkarra application. Provides an embedded HTTP server and component scanning.
public class WordCount {
public static void main(final String[] args) {
// (1) Load the configuration (application.conf)
var config = AzkarraConf.create();
// (2) Create the Azkarra Context
var context = DefaultAzkarraContext.create();
// (3) Register the Topology to the default environment
context.addTopology(
WordCountTopology.class,
Executed.as("word-count")
);
// (4) Create Azkarra application
new AzkarraApplication()
.setContext(context)
.setConfiguration(config)
// (5) Enable and configure embedded HTTP server
.setHttpServerEnable(true)
.setHttpServerConf(ServerConfig.newBuilder()
.setListener("localhost")
.setPort(8080)
.build()
)
// (6) Start Azkarra
.run(args);
}
}
Concepts: AzkarraApplication (with component scanning)
@AzkarraStreamsApplication
public class WordCount {
public static void main(String[] args) {
AzkarraApplication.run(WordCount.class, args);
}
@Component
public static class WordCountTopology implements
TopologyProvider, Configurable {
private Conf conf;
@Override
public Topology topology() {
var builder = new StreamsBuilder();
// ...code omitted for clarity
return builder.build();
}
@Override
public void configure(Conf conf) {
this.conf = conf;
}
@Override
public String version() { return "1.0"; }
}
}
What's left to do? Handling deserialization exceptions (we have 15' left) 🤔
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
Solution #1: Built-in mechanisms
default.deserialization.exception.handler
■ CONTINUE: continue with processing
■ FAIL: fail the processing and stop
Two available implementations:
■ LogAndContinueExceptionHandler
■ LogAndFailExceptionHandler
Not really suitable for production: corrupted messages cannot be monitored efficiently.
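For reference, wiring the built-in handler is a one-liner in the streams config (standard Kafka Streams API):

import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;
import java.util.Properties;

Properties props = new Properties();
// Log corrupted records and keep processing instead of crashing the instance.
props.put(
    StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
    LogAndContinueExceptionHandler.class
);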
Solution #2: Dead Letter Queue Topic
DeserializationExceptionHandler: send corrupted messages to a special topic.
[Diagram: corrupted records from the source topic are skipped by the topology and routed by the handler to a dead letter topic.]

Solution #3: Sentinel Value
Deserializer<T>: catch any exception thrown during deserialization and return a default value (e.g. null, "N/A", etc.).
[Diagram: a SafeDeserializer wraps the delegate deserializer and yields null for corrupted records.]
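The sentinel-value pattern is easy to sketch: a wrapping deserializer that delegates and falls back to a default on failure (a hedged illustration, not Azkarra's actual implementation):

import org.apache.kafka.common.serialization.Deserializer;

// Illustrative wrapper: returns a sentinel value instead of throwing.
public class SafeDeserializer<T> implements Deserializer<T> {

    private final Deserializer<T> delegate;
    private final T defaultValue;

    public SafeDeserializer(final Deserializer<T> delegate, final T defaultValue) {
        this.delegate = delegate;
        this.defaultValue = defaultValue;
    }

    @Override
    public T deserialize(final String topic, final byte[] data) {
        try {
            return delegate.deserialize(topic, data);
        } catch (final Exception e) {
            // Corrupted record: swallow the exception and emit the sentinel.
            return defaultValue;
        }
    }
}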
Using Azkarra

Solution #2: DeadLetterTopicExceptionHandler
■ By default, sends corrupted records to <Topic>-rejected
■ Doesn't change the schema/format of the corrupted message.
■ Uses Kafka headers to trace the exception cause and origin, e.g.:
  ○ __errors.exception.stacktrace
  ○ __errors.exception.message
  ○ __errors.exception.class.name
  ○ __errors.timestamp
  ○ __errors.application.id
  ○ __errors.record.[topic|partition|offset]
■ Can be configured to send records to a Kafka cluster distinct from the one used by KafkaStreams.

Solution #3: SafeSerdes
SafeSerdes.Long(-1L);
SafeSerdes.UUID(null);
SafeSerdes.serdeFrom(
    new JsonSerializer(),
    new JsonDeserializer(),
    NullNode.getInstance()
);
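Hedging on the exact API, such a Serde plugs in wherever a regular Serde goes, e.g.:

// Assumed usage: read Long values, substituting -1L for corrupted records.
KStream<String, Long> stream = builder.stream(
    "some-topic",
    Consumed.with(Serdes.String(), SafeSerdes.Long(-1L))
);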
Our TODO list: Monitoring (we have 10' left) 🙃
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
How to monitor Kafka Streams?
The Kafka Streams API provides a few methods for monitoring the state of a running instance:
■ KafkaStreams#state(), KafkaStreams#setStateListener()
  ⎼ CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, ERROR
  ⎼ Can be used for checking the liveness and readiness of the instance.
■ KafkaStreams#localThreadsMetadata()
  ⎼ Returns information about local threads/tasks and partition assignments.
■ KafkaStreams#metrics()
Best practices:
■ Build some REST APIs to expose the states of Kafka Streams (see the sketch below).
■ Export metrics using JMX, Prometheus, etc.
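A hedged sketch of liveness/readiness checks built on KafkaStreams#state() (the exact health-check semantics are an assumption, not a fixed convention):

import org.apache.kafka.streams.KafkaStreams;

public class StreamsHealth {

    private final KafkaStreams streams;

    public StreamsHealth(final KafkaStreams streams) {
        this.streams = streams;
    }

    // Liveness: the instance has not entered a terminal state.
    public boolean isAlive() {
        final KafkaStreams.State state = streams.state();
        return state != KafkaStreams.State.NOT_RUNNING
            && state != KafkaStreams.State.ERROR;
    }

    // Readiness: the instance is running and can serve interactive queries.
    public boolean isReady() {
        return streams.state() == KafkaStreams.State.RUNNING;
    }
}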
Kafka Consumer Lag and Offsets
Maybe the most fundamental indicator to monitor:
■ How far behind are the Kafka Streams consumers from the producers?
■ Is the Kafka Streams application ready to process records and serve interactive queries?

KafkaStreams#allLocalStorePartitionLags()
KafkaStreams#setGlobalStateRestoreListener()
■ NOTE: internal KafkaStreams threads do not start consuming messages until stores are recovered.

Consumer offsets can also be tracked with a ConsumerInterceptor, configured on KafkaStreams using main.consumer.interceptor.classes:

public interface ConsumerInterceptor<K, V> extends Configurable, AutoCloseable {
    ConsumerRecords<K, V> onConsume(ConsumerRecords<K, V> records);
    void onCommit(Map<TopicPartition, OffsetAndMetadata> offsets);
    void close();
}
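As an illustration, a hedged sketch using the public lag API mentioned above to get a single "how far behind are we" number for the instance:

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.LagInfo;
import java.util.Map;

// Sum the offset lag across all local store partitions.
static long totalStoreLag(final KafkaStreams streams) {
    long total = 0L;
    // Map of store name -> (partition -> LagInfo)
    for (final Map<Integer, LagInfo> partitions :
             streams.allLocalStorePartitionLags().values()) {
        for (final LagInfo lag : partitions.values()) {
            total += lag.offsetLag();
        }
    }
    return total;
}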
Azkarra REST API
Azkarra provides a REST API for managing, monitoring, and querying Kafka Streams instances.
■ Provides support for Interactive Queries.
■ Built-in authentication and authorization mechanisms (Basic Auth, 2-way SSL).
■ Allows registration of new JAX-RS resources using a plugin interface: AzkarraRestExtension.

● Get information about the local streams instance
  GET /api/v1/streams
● Get the status for a streams instance
  GET /api/v1/streams/(string: id)/status
● Get the configuration for a streams instance
  GET /api/v1/streams/(string: id)/config
● Get current metrics for a streams instance
  GET /api/v1/streams/(string: applicationId)/metrics
● Get all metrics in Prometheus format (via Micrometer)
  GET /prometheus
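For example, assuming the embedded server configured earlier (localhost:8080) and the "word-count" id from the previous slides, querying the instance might look like:

# List local streams instances
curl -s http://localhost:8080/api/v1/streams

# Get the status of a given instance (id is an assumption here)
curl -s http://localhost:8080/api/v1/streams/word-count/status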
Putting it all together: exporting Kafka Streams states anywhere
Azkarra can be configured to periodically report the internal state of a KafkaStreams instance.
■ Uses a StreamsLifecycleInterceptor: MonitoringStreamsInterceptor
■ Accepts a pluggable reporter class
  ⎼ Default: KafkaMonitoringReporter
  ⎼ Publishes events that adhere to the CloudEvents specification.
{
"id":
"appid:word-count;appsrv:localhost:8080;ts:1620691200000",
"source": "azkarra/ks/localhost:8080",
"specversion": "1.0",
"type": "io.streamthoughts.azkarra.streams.stateupdateevent",
"time": "2021-05-11T00:00:00.000+0000",
"datacontenttype": "application/json",
"ioazkarramonitorintervalms": 10000,
"ioazkarrastreamsappid": "word-count",
"ioazkarraversion": "0.9.2",
"ioazkarrastreamsappserver": "localhost:8080",
"data": {
"state": "RUNNING",
"threads": [
{
"name": "word-count-...-93e9a84057ad-StreamThread-1",
"state": "RUNNING",
"active_tasks": [],
"standby_tasks": [],
"clients": {}
}
],
"offsets": {
"group": "",
"consumers": []
},
"stores": {
"partitionRestoreInfos": [],
"partitionLagInfos": []
},
"state_changed_time": 1620691200000
}
}
(An example event, in CloudEvents format.)
Our TODO list: Packaging (we still have 5' left) 😬
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
Packaging Kafka Streams with Azkarra
Azkarra-based applications can be packaged like any other Kafka Streams app.
Azkarra Worker → an empty Azkarra application
■ Topologies and components can be loaded from an external uber-jar
  ⎼ Similar to Kafka Connect plugins and connectors
■ Can be used as the base image for Docker
  ⎼ Use Jib to build optimized Docker images for Java

$ docker run --net host \
  -v ./application.conf:/etc/azkarra/azkarra.conf \
  -v ./local-topologies:/usr/share/azkarra-components/ \
  streamthoughts/azkarra-streams-worker:latest

Jib + Docker + Azkarra = ❤
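As a sketch of the Jib route (the plugin coordinates are real; reusing the worker image as base and the target image name are assumptions):

<plugin>
  <groupId>com.google.cloud.tools</groupId>
  <artifactId>jib-maven-plugin</artifactId>
  <version>3.1.4</version>
  <configuration>
    <!-- Assumption: reuse the Azkarra worker image as the base image -->
    <from>
      <image>streamthoughts/azkarra-streams-worker:latest</image>
    </from>
    <to>
      <image>my-registry/azkarra-getting-started:1.0</image>
    </to>
  </configuration>
</plugin>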
Deploying Kafka Streams with Azkarra (in Kubernetes)
Using Kubernetes, topologies can be downloaded and mounted using an init container.
[Diagram: a Deployment or StatefulSet runs the azkarra-worker container; an init container fetches my-topology-with-dependencies-1.0.jar over HTTP from a repository manager (e.g., Nexus/Artifactory) into a shared volume at /var/lib/components/, referenced via azkarra.component.paths.]
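A hedged sketch of that pattern (all names, image tags, and the download URL are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: word-count
spec:
  replicas: 1
  selector:
    matchLabels: { app: word-count }
  template:
    metadata:
      labels: { app: word-count }
    spec:
      initContainers:
        - name: fetch-topology
          image: curlimages/curl:latest
          # Placeholder URL: fetch the topology uber-jar from a repository manager
          args: ["-L", "-o", "/var/lib/components/my-topology-with-dependencies-1.0.jar",
                 "https://nexus.example.com/repository/releases/my-topology-with-dependencies-1.0.jar"]
          volumeMounts:
            - { name: components, mountPath: /var/lib/components }
      containers:
        - name: azkarra-worker
          image: streamthoughts/azkarra-streams-worker:latest
          # Assumption: the worker's azkarra.conf sets azkarra.component.paths
          # to /var/lib/components so topologies are discovered at startup.
          volumeMounts:
            - { name: components, mountPath: /var/lib/components }
      volumes:
        - name: components
          emptyDir: {}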
DONE: in less than 30 min using Azkarra 🚀
☑ Test the app is working as expected
☑ Externalize configuration
☑ Handle transient errors
☑ Handle deserialization exceptions
☑ Expose the state of the Kafka Streams application
☑ Be able to monitor offsets and lags of consumers and state stores
☑ Interactive Queries (optional)
☑ Package the Kafka Streams application for production
Demo
(new coins...we still have 5' left) 🤫
Takeaways: Conclusion
Kafka Streams is a very good choice for quickly creating streaming applications.
But building applications for production can be a lot of work.
Azkarra aims to be a fast path to production by providing all the features you need:
■ Built-in mechanisms for handling exceptions
■ Built-in REST API for executing Interactive Queries
■ Consumer offsets and lag monitoring
■ Topology visualization
■ Dashboard UI
Takeaways: Roadmap
■ Add support for querying stale stores.
■ Add support for deploying and managing Kafka Streams topologies directly on Kubernetes
  ❏ i.e., KubStreamsExecutionEnvironment
■ Enhance the WebUI with visualizations of the key metrics to monitor.
Takeaways: Links
Official website: https://www.azkarrastreams.io/
GitHub: https://github.com/streamthoughts/azkarra-streams (for contributing and adding ⭐)
Slack: https://communityinviter.com/apps/azkarra-streams/azkarra-streams-community
Demo: https://github.com/streamthoughts/demo-kafka-streams-scottify
Join us on Slack!

Thank you
@fhussonnois
Florian HUSSONNOIS ▪ florian@streamthoughts.io
[Screenshots: Azkarra Dashboard]
Images & Icons
■ Photo by Mark König on Unsplash
■ Photo by CHUTTERSNAP on Unsplash

More Related Content

What's hot

Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringDatabricks
 
Flink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at AlibabaFlink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at AlibabaDataWorks Summit
 
Developing real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and KafkaDeveloping real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and Kafkamarius_bogoevici
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsJean-Paul Calbimonte
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...HostedbyConfluent
 
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...Flink Forward
 
Resilience4j with Spring Boot
Resilience4j with Spring BootResilience4j with Spring Boot
Resilience4j with Spring BootKnoldus Inc.
 
The Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyThe Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyJosh Baer
 
Apache Kafka® Security Overview
Apache Kafka® Security OverviewApache Kafka® Security Overview
Apache Kafka® Security Overviewconfluent
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsDatabricks
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Timothy Spann
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafkaconfluent
 
Developing an Akka Edge6
Developing an Akka Edge6Developing an Akka Edge6
Developing an Akka Edge6saaaaaaki
 
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...HostedbyConfluent
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafkaconfluent
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteDataWorks Summit
 

What's hot (19)

Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy MonitoringApache Spark Listeners: A Crash Course in Fast, Easy Monitoring
Apache Spark Listeners: A Crash Course in Fast, Easy Monitoring
 
Flink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at AlibabaFlink SQL & TableAPI in Large Scale Production at Alibaba
Flink SQL & TableAPI in Large Scale Production at Alibaba
 
Developing real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and KafkaDeveloping real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and Kafka
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementations
 
KSQL Intro
KSQL IntroKSQL Intro
KSQL Intro
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
Mind the App: How to Monitor Your Kafka Streams Applications | Bruno Cadonna,...
 
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...Running Flink in Production:  The good, The bad and The in Between - Lakshmi ...
Running Flink in Production: The good, The bad and The in Between - Lakshmi ...
 
Resilience4j with Spring Boot
Resilience4j with Spring BootResilience4j with Spring Boot
Resilience4j with Spring Boot
 
The Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyThe Evolution of Big Data at Spotify
The Evolution of Big Data at Spotify
 
Apache Kafka® Security Overview
Apache Kafka® Security OverviewApache Kafka® Security Overview
Apache Kafka® Security Overview
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Performance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark MetricsPerformance Troubleshooting Using Apache Spark Metrics
Performance Troubleshooting Using Apache Spark Metrics
 
Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)Hello, kafka! (an introduction to apache kafka)
Hello, kafka! (an introduction to apache kafka)
 
Reliability Guarantees for Apache Kafka
Reliability Guarantees for Apache KafkaReliability Guarantees for Apache Kafka
Reliability Guarantees for Apache Kafka
 
Developing an Akka Edge6
Developing an Akka Edge6Developing an Akka Edge6
Developing an Akka Edge6
 
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...
Extending the Apache Kafka® Replication Protocol Across Clusters, Sanjana Kau...
 
KSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache KafkaKSQL: Open Source Streaming for Apache Kafka
KSQL: Open Source Streaming for Apache Kafka
 
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great TasteIn-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
 

Similar to Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30 min using Azkarra | Florian Hussonnois, StreamThoughts

Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Guido Schmutz
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Kai Wähner
 
Kafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformKafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformPaolo Castagna
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaGuido Schmutz
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaJoe Stein
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraJoe Stein
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
 
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaExploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaLightbend
 
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...Lightbend
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientistsJenn Rawlins
 
Productionalizing spark streaming applications
Productionalizing spark streaming applicationsProductionalizing spark streaming applications
Productionalizing spark streaming applicationsRobert Sanders
 
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...DevOps_Fest
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kai Wähner
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterPaolo Castagna
 
Apache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformApache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformPaolo Castagna
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformconfluent
 
Asynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsAsynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsJohan Andrén
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterconfluent
 

Similar to Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30 min using Azkarra | Florian Hussonnois, StreamThoughts (20)

Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!Apache Kafka - Scalable Message Processing and more!
Apache Kafka - Scalable Message Processing and more!
 
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
Confluent REST Proxy and Schema Registry (Concepts, Architecture, Features)
 
Training
TrainingTraining
Training
 
Kafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platformKafka streams - From pub/sub to a complete stream processing platform
Kafka streams - From pub/sub to a complete stream processing platform
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Kafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around KafkaKafka Connect & Streams - the ecosystem around Kafka
Kafka Connect & Streams - the ecosystem around Kafka
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and CassandraReal-Time Log Analysis with Apache Mesos, Kafka and Cassandra
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaExploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
 
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
Build Real-Time Streaming ETL Pipelines With Akka Streams, Alpakka And Apache...
 
Kafka for data scientists
Kafka for data scientistsKafka for data scientists
Kafka for data scientists
 
Productionalizing spark streaming applications
Productionalizing spark streaming applicationsProductionalizing spark streaming applications
Productionalizing spark streaming applications
 
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
DevOps Fest 2020. Сергій Калінець. Building Data Streaming Platform with Apac...
 
Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)Kafka Connect and Streams (Concepts, Architecture, Features)
Kafka Connect and Streams (Concepts, Architecture, Features)
 
Introduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matterIntroduction to apache kafka, confluent and why they matter
Introduction to apache kafka, confluent and why they matter
 
Apache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming PlatformApache Kafka - A Distributed Streaming Platform
Apache Kafka - A Distributed Streaming Platform
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platform
 
Asynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka StreamsAsynchronous stream processing with Akka Streams
Asynchronous stream processing with Akka Streams
 
Introduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matterIntroduction to Apache Kafka and Confluent... and why they matter
Introduction to Apache Kafka and Confluent... and why they matter
 

More from HostedbyConfluent

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonHostedbyConfluent
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolHostedbyConfluent
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesHostedbyConfluent
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaHostedbyConfluent
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonHostedbyConfluent
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonHostedbyConfluent
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyHostedbyConfluent
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...HostedbyConfluent
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...HostedbyConfluent
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersHostedbyConfluent
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformHostedbyConfluent
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubHostedbyConfluent
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonHostedbyConfluent
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLHostedbyConfluent
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceHostedbyConfluent
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondHostedbyConfluent
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsHostedbyConfluent
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemHostedbyConfluent
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksHostedbyConfluent
 

More from HostedbyConfluent (20)

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Renaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit LondonRenaming a Kafka Topic | Kafka Summit London
Renaming a Kafka Topic | Kafka Summit London
 
Evolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at TrendyolEvolution of NRT Data Ingestion Pipeline at Trendyol
Evolution of NRT Data Ingestion Pipeline at Trendyol
 
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking TechniquesEnsuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
 
Exactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and KafkaExactly-once Stream Processing with Arroyo and Kafka
Exactly-once Stream Processing with Arroyo and Kafka
 
Fish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit LondonFish Plays Pokemon | Kafka Summit London
Fish Plays Pokemon | Kafka Summit London
 
Tiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit LondonTiered Storage 101 | Kafla Summit London
Tiered Storage 101 | Kafla Summit London
 
Building a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And WhyBuilding a Self-Service Stream Processing Portal: How And Why
Building a Self-Service Stream Processing Portal: How And Why
 
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
 
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
 
Navigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka ClustersNavigating Private Network Connectivity Options for Kafka Clusters
Navigating Private Network Connectivity Options for Kafka Clusters
 
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data PlatformApache Flink: Building a Company-wide Self-service Streaming Data Platform
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
 
Explaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy PubExplaining How Real-Time GenAI Works in a Noisy Pub
Explaining How Real-Time GenAI Works in a Noisy Pub
 
TL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit LondonTL;DR Kafka Metrics | Kafka Summit London
TL;DR Kafka Metrics | Kafka Summit London
 
A Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSLA Window Into Your Kafka Streams Tasks | KSL
A Window Into Your Kafka Streams Tasks | KSL
 
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing PerformanceMastering Kafka Producer Configs: A Guide to Optimizing Performance
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
 
Data Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and BeyondData Contracts Management: Schema Registry and Beyond
Data Contracts Management: Schema Registry and Beyond
 
Code-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink AppsCode-First Approach: Crafting Efficient Flink Apps
Code-First Approach: Crafting Efficient Flink Apps
 
Debezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC EcosystemDebezium vs. the World: An Overview of the CDC Ecosystem
Debezium vs. the World: An Overview of the CDC Ecosystem
 
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local DisksBeyond Tiered Storage: Serverless Kafka with No Local Disks
Beyond Tiered Storage: Serverless Kafka with No Local Disks
 

Recently uploaded

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Writing Blazing Fast, and Production-Ready Kafka Streams apps in less than 30 min using Azkarra | Florian Hussonnois, StreamThoughts

  • 12. Azkarra Framework in a nutshell
A lightweight Java framework to make a Kafka Streams application production-ready in just a few lines of code.
■ Distributed under the Apache License 2.0.
■ Developed based on experience on a wide range of projects.
■ Uses best practices developed by Kafka users and the open-source community.

Overview:
■ REST API: health check, monitoring, Interactive Queries, etc.
■ Embedded WebUI: topology DAG visualization.
■ Built-in features for handling exceptions and tuning RocksDB.
■ Support for Server-Sent Events.

#azkarrastreams
  • 13. Azkarra Streams: how to use it?
Available on Maven Central.

Azkarra Framework:

<dependency>
    <groupId>io.streamthoughts</groupId>
    <artifactId>azkarra-streams</artifactId>
    <version>0.9.2</version>
</dependency>

Provides reusable classes for Kafka Streams:

<dependency>
    <groupId>io.streamthoughts</groupId>
    <artifactId>azkarra-commons</artifactId>
    <version>0.9.2</version>
</dependency>

Quick start:

mvn archetype:generate \
    -DarchetypeGroupId=io.streamthoughts \
    -DarchetypeArtifactId=azkarra-quickstart-java \
    -DarchetypeVersion=0.9.2 \
    -DgroupId=azkarra.streams \
    -DartifactId=azkarra-getting-started \
    -Dversion=1.0 \
    -Dpackage=azkarra \
    -DinteractiveMode=false
  • 14. Let’s rewrite the “Word Count” using Azkarra (we still have 25’ left) 👾
  • 15. Concepts: TopologyProvider
Container for building and configuring a Topology.

class WordCountTopology implements TopologyProvider, Configurable {

    private Conf conf;

    @Override
    public Topology topology() {
        var source = conf.getString("topic.source.name");
        var sink = conf.getString("topic.sink.name");
        var store = conf.getString("store.name");
        var builder = new StreamsBuilder();
        builder
            .<String, String>stream(source)
            .flatMapValues(splitAndToLowercase())
            .groupBy((key, value) -> value)
            .count(Materialized.as(store))
            .toStream()
            .to(sink, Produced.with(Serdes.String(), Serdes.Long()));
        return builder.build();
    }

    @Override
    public void configure(final Conf conf) {
        this.conf = conf;
    }

    @Override
    public String version() {
        return "1.0";
    }
}
  • 16. Concepts: StreamsExecutionEnvironment
Manages the life cycle of KafkaStreams instances.

// (1) Define the KafkaStreams configuration
var streamsConfig = Conf.of(
    BOOTSTRAP_SERVERS_CONFIG, "localhost:9092",
    DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass(),
    DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass()
);

// (2) Define the Topology configuration
var topologyConfig = Conf.of(
    "topic.source.name", "topic-text-lines",
    "topic.sink.name", "topic-text-word-count",
    "store.name", "Count"
);

// (3) Create and configure a local execution environment
var env = LocalStreamsExecutionEnvironment
    .create(Conf.of("streams", streamsConfig))
    // (4) Register our topology to run
    .registerTopology(
        WordCountTopology::new,
        Executed.as("WordCount").withConfig(topologyConfig)
    );

// (5) Start the environment
env.start();

// (6) Add a shutdown hook
Runtime.getRuntime().addShutdownHook(new Thread(env::stop));
  • 17. Let’s start KafkaStreams… Boom! Transient Errors

[word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1] Received error code INCOMPLETE_SOURCE_TOPIC_METADATA
16:05:12.585 [word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1] ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator - [Consumer clientId=word-count-1-0-ae1a9bf9-101d-4796-ad36-2e1130e83573-StreamThread-1-consumer, groupId=word-count-1-0] User provided listener org.apache.kafka.streams.processor.internals.StreamsRebalanceListener failed on invocation of onPartitionsAssigned for partitions []
org.apache.kafka.streams.errors.MissingSourceTopicException: One or more source topics were missing during rebalance
    at org.apache.kafka.streams.processor.internals.StreamsRebalanceListener.onPartitionsAssigned(StreamsRebalanceListener.java:57) ~[kafka-streams-2.7.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293) [kafka-clients-2.7.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430) [kafka-clients-2.7.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:451) [kafka-clients-2.7.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:367) [kafka-clients-2.7.0.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508) [kafka-clients-2.7.0.jar:?]
  • 18. Concepts: StreamsLifecycleInterceptor
A pluggable interface that allows intercepting a KafkaStreams instance before it is started or stopped (a sketch of a custom interceptor follows below).

interface StreamsLifecycleInterceptor {

    /**
     * Intercepts the streams instance before it is started.
     */
    default void onStart(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        chain.execute();
    }

    /**
     * Intercepts the streams instance before it is stopped.
     */
    default void onStop(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        chain.execute();
    }

    /**
     * Used for logging information.
     */
    default String name() {
        return getClass().getSimpleName();
    }
}

Built-in implementations:
■ AutoCreateTopicsInterceptor
■ WaitForSourceTopicsInterceptor
■ KafkaBrokerReadyInterceptor
...and a few more (discussed later) 😉
Most interceptors are configurable.
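As a quick illustration of the contract above, a custom interceptor might simply log around the chain. This is a minimal sketch: the class name is ours, and the Azkarra API imports are omitted (the interface is exactly the one shown above).

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
// Azkarra API imports (StreamsLifecycleInterceptor, StreamsLifecycleContext,
// StreamsLifecycleChain) omitted; see the interface definition above.

public class LoggingLifecycleInterceptor implements StreamsLifecycleInterceptor {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingLifecycleInterceptor.class);

    @Override
    public void onStart(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        LOG.info("KafkaStreams instance is about to start");
        chain.execute(); // always continue the interceptor chain
    }

    @Override
    public void onStop(StreamsLifecycleContext context, StreamsLifecycleChain chain) {
        LOG.info("KafkaStreams instance is about to stop");
        chain.execute();
    }
}

Such an interceptor can then be attached with addStreamsLifecycleInterceptor(LoggingLifecycleInterceptor::new), as the next slide shows for the built-in AutoCreateTopicsInterceptor.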
  • 19. Concepts: AutoCreateTopicsInterceptor
Automatically infers the source and sink topics to be created from Topology.describe().
■ Internally, uses the AdminClient API.
■ Can be used during development for deleting all topics when the instance is stopped.

import static io.s.a.r.i.AutoCreateTopicsInterceptorConfig.*;

// (1) Define the KafkaStreams configuration
var streamsConfig = ...

// (2) Define the Topology configuration
var topologyConfig = ...

// (3) Define the Environment configuration
var envConfig = Conf.of(
    "streams", streamsConfig,
    AUTO_CREATE_TOPICS_NUM_PARTITIONS_CONFIG, 2,
    AUTO_CREATE_TOPICS_REPLICATION_FACTOR_CONFIG, 1,
    // WARN - ONLY DURING DEVELOPMENT
    AUTO_DELETE_TOPICS_ENABLE_CONFIG, true
);

// (4) Create and configure the local execution environment
LocalStreamsExecutionEnvironment
    .create(envConfig)
    // (5) Add the StreamsLifecycleInterceptor
    .addStreamsLifecycleInterceptor(AutoCreateTopicsInterceptor::new)
    // ...code omitted for clarity
  • 20. What's left to do? Externalizing configuration (we have 20’ left) 😀
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
  • 21. External Configuration: Conf & AzkarraConf
Azkarra provides the Configurable interface, which can be implemented by most Azkarra components:

void configure(final Conf configuration);

■ AzkarraConf: uses the Lightbend Config library.
  ○ Allows loading configuration settings from HOCON files.

// file: application.conf
azkarra {
    // The configuration settings passed to the Kafka Streams
    // instance should be prefixed with `streams`
    streams {
        bootstrap.servers = "localhost:9092"
        default.key.serde = "org.apache.kafka..Serdes$StringSerde"
        default.value.serde = "org.apache.kafka..Serdes$StringSerde"
    }
    topic.source.name = "topic-text-lines"
    topic.sink.name = "topic-text-word-count"
    store.name = "Count"
    auto.create.topics.num.partitions = 2
    auto.create.topics.replication.factor = 1
    auto.delete.topics.enable = true
}

// file: Main.class
var config = AzkarraConf.create().getSubConf("azkarra");
  • 22. Concepts: AzkarraContext
Container for dependency injection. Used to automatically configure streams execution environments.

public static void main(final String[] args) {
    // (1) Load the configuration (application.conf)
    var config = AzkarraConf.create().getSubConf("azkarra");

    // (2) Create the Azkarra context
    var context = DefaultAzkarraContext.create(config);

    // (3) Register a StreamsLifecycleInterceptor as a component
    context.registerComponent(ConsoleStreamsLifecycleInterceptor.class);

    // (4) Register the Topology to the default environment
    context.addTopology(WordCountTopology.class, Executed.as("word-count"));

    // (5) Start the context
    context.setRegisterShutdownHook(true).start();
}
  • 23. Concepts: AzkarraApplication
Used to bootstrap and configure an Azkarra application.
■ Provides an embedded HTTP server.
■ Provides component scanning.

public class WordCount {
    public static void main(final String[] args) {
        // (1) Load the configuration (application.conf)
        var config = AzkarraConf.create();

        // (2) Create the Azkarra context
        var context = DefaultAzkarraContext.create();

        // (3) Register the Topology to the default environment
        context.addTopology(WordCountTopology.class, Executed.as("word-count"));

        // (4) Create the Azkarra application
        new AzkarraApplication()
            .setContext(context)
            .setConfiguration(config)
            // (5) Enable and configure the embedded HTTP server
            .setHttpServerEnable(true)
            .setHttpServerConf(ServerConfig.newBuilder()
                .setListener("localhost")
                .setPort(8080)
                .build()
            )
            // (6) Start Azkarra
            .run(args);
    }
}
  • 24. Concepts: AzkarraApplication (with component scanning)
The same application, reduced to annotations:

@AzkarraStreamsApplication
public class WordCount {

    public static void main(String[] args) {
        AzkarraApplication.run(WordCount.class, args);
    }

    @Component
    public static class WordCountTopology implements TopologyProvider, Configurable {

        private Conf conf;

        @Override
        public Topology topology() {
            var builder = new StreamsBuilder();
            // ...code omitted for clarity
            return builder.build();
        }

        @Override
        public void configure(Conf conf) {
            this.conf = conf;
        }

        @Override
        public String version() {
            return "1.0";
        }
    }
}
  • 25. What's left to do? Handling Deserialization Exceptions (we have 15’ left) 🤔
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
  • 26. Solution #1: Built-in mechanisms
default.deserialization.exception.handler
■ CONTINUE: continue with processing
■ FAIL: fail the processing and stop

Two available implementations:
■ LogAndContinueExceptionHandler
■ LogAndFailExceptionHandler

Not really suitable for production: corrupted messages cannot be monitored efficiently. (A minimal configuration sketch follows below.)
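For completeness, wiring one of these handlers is a single Kafka Streams property. A minimal sketch: the wrapper class is ours, everything else is the standard Kafka Streams API.

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

public class HandlerConfig {
    public static Properties props() {
        Properties props = new Properties();
        // CONTINUE semantics: log the corrupted record, skip it, keep processing
        props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                  LogAndContinueExceptionHandler.class);
        return props;
    }
}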
  • 27. Solution #2: Dead Letter Queue Topic
DeserializationExceptionHandler: send corrupted messages to a special topic, so the topology skips them while they remain available for inspection.

Solution #3: Sentinel Value
Deserializer<T>: catch any exception thrown during deserialization and return a default value (e.g., null, “N/A”, etc.).

[Diagrams: corrupted records routed from the source topic to a dead-letter topic while the topology skips them; a SafeDeserializer wrapping a delegate deserializer and emitting null for corrupted records.]
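To make the sentinel-value idea concrete, here is a minimal sketch of such a wrapping deserializer. The class name is illustrative; this is not Azkarra's actual SafeDeserializer implementation.

import org.apache.kafka.common.serialization.Deserializer;

public class FallbackDeserializer<T> implements Deserializer<T> {

    private final Deserializer<T> delegate;
    private final T fallback;

    public FallbackDeserializer(final Deserializer<T> delegate, final T fallback) {
        this.delegate = delegate;
        this.fallback = fallback;
    }

    @Override
    public T deserialize(final String topic, final byte[] data) {
        try {
            return delegate.deserialize(topic, data);
        } catch (final Exception e) {
            // Corrupted record: return the configured sentinel instead of failing the StreamThread
            return fallback;
        }
    }
}

Downstream processors then only need to filter out the sentinel, at the cost of losing the original corrupted payload.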
  • 28. Using Azkarra
Solution #2: DeadLetterTopicExceptionHandler
■ By default, sends corrupted records to <Topic>-rejected.
■ Doesn’t change the schema/format of the corrupted message.
■ Uses Kafka headers to trace the exception cause and origin, e.g.:
  ○ __errors.exception.stacktrace
  ○ __errors.exception.message
  ○ __errors.exception.class.name
  ○ __errors.timestamp
  ○ __errors.application.id
  ○ __errors.record.[topic|partition|offset]
■ Can be configured to send records to a Kafka cluster distinct from the one used by Kafka Streams.

Solution #3: SafeSerdes

SafeSerdes.Long(-1L);
SafeSerdes.UUID(null);
SafeSerdes.serdeFrom(
    new JsonSerializer(),
    new JsonDeserializer(),
    NullNode.getInstance()
);
  • 29. Our TODO list: Monitoring (we have 10’ left) 🙃
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
  • 30. How to monitor Kafka Streams?
The Kafka Streams API provides a few methods for monitoring the state of a running instance:
■ KafkaStreams#state(), KafkaStreams#setStateListener()
  ⎼ CREATED, REBALANCING, RUNNING, PENDING_SHUTDOWN, NOT_RUNNING, ERROR
  ⎼ Can be used for checking the liveness and readiness of the instance.
■ KafkaStreams#localThreadsMetadata()
  ⎼ Returns information about local threads/tasks and partition assignments.
■ KafkaStreams#metrics()

Best practices (see the sketch below):
■ Build a REST API to expose the state of Kafka Streams.
■ Export metrics using JMX, Prometheus, etc.
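As a sketch of the listener-based approach (the class and field names are ours), readiness can be derived from state transitions and exposed through whatever health-check endpoint you build:

import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.streams.KafkaStreams;

public class StreamsHealthCheck {

    private final AtomicBoolean ready = new AtomicBoolean(false);

    public void register(final KafkaStreams streams) {
        // The instance can only process records and serve interactive
        // queries while in the RUNNING state
        streams.setStateListener((newState, oldState) ->
            ready.set(newState == KafkaStreams.State.RUNNING));
    }

    public boolean isReady() {
        return ready.get();
    }
}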
  • 31. Kafka Consumer Lag and Offsets
Perhaps the most fundamental indicators to monitor:
■ How far behind the producers are the Kafka Streams consumers?
■ Is the Kafka Streams application ready to process records, and can it serve interactive queries?

Consumer (an illustrative interceptor follows below):

public interface ConsumerInterceptor<K, V> extends Configurable, AutoCloseable {
    ConsumerRecords<K, V> onConsume(ConsumerRecords<K, V> records);
    void onCommit(Map<TopicPartition, OffsetAndMetadata> offsets);
    void close();
}

Configured using: main.consumer.interceptor.classes

KafkaStreams:
■ KafkaStreams#allLocalStorePartitionLags()
■ KafkaStreams#setGlobalStateRestoreListener()
■ NOTE: internal KafkaStreams threads do not start consuming messages until stores are recovered.
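An illustrative interceptor (the class name is ours) that logs committed offsets, which can be compared with end offsets to derive lag; it would be registered through main.consumer.interceptor.classes as noted above:

import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerInterceptor;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommitLoggingInterceptor implements ConsumerInterceptor<byte[], byte[]> {

    @Override
    public ConsumerRecords<byte[], byte[]> onConsume(final ConsumerRecords<byte[], byte[]> records) {
        return records; // pass records through unchanged
    }

    @Override
    public void onCommit(final Map<TopicPartition, OffsetAndMetadata> offsets) {
        // In a real application, export these to your metrics system instead of stdout
        offsets.forEach((tp, om) ->
            System.out.printf("committed %s -> offset %d%n", tp, om.offset()));
    }

    @Override
    public void configure(final Map<String, ?> configs) { }

    @Override
    public void close() { }
}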
  • 32. Azkarra REST API
Azkarra provides a REST API for managing, monitoring, and querying Kafka Streams instances.
■ Provides support for Interactive Queries.
■ Built-in authentication and authorization mechanisms (Basic Auth, 2-way SSL).
■ Allows registration of new JAX-RS resources using a plugin interface: AzkarraRestExtension.

● Get information about the local streams instance
  GET /api/v1/streams
● Get the status for the streams instance
  GET /api/v1/streams/(string: id)/status
● Get the configuration for the streams instance
  GET /api/v1/streams/(string: id)/config
● Get current metrics for the streams instance
  GET /api/v1/streams/(string: applicationId)/metrics
● Get all metrics in Prometheus format (via Micrometer Prometheus)
  GET /prometheus
  • 33. Putting it all together: exporting Kafka Streams states anywhere
Azkarra can be configured to periodically report the internal state of a KafkaStreams instance.
■ Uses a StreamsLifecycleInterceptor: MonitoringStreamsInterceptor.
■ Accepts a pluggable reporter class.
  ⎼ Default: KafkaMonitoringReporter.
  ⎼ Publishes events that adhere to the CloudEvents specification.

{
  "id": "appid:word-count;appsrv:localhost:8080;ts:1620691200000",
  "source": "azkarra/ks/localhost:8080",
  "specversion": "1.0",
  "type": "io.streamthoughts.azkarra.streams.stateupdateevent",
  "time": "2021-05-11T00:00:00.000+0000",
  "datacontenttype": "application/json",
  "ioazkarramonitorintervalms": 10000,
  "ioazkarrastreamsappid": "word-count",
  "ioazkarraversion": "0.9.2",
  "ioazkarrastreamsappserver": "localhost:8080",
  "data": {
    "state": "RUNNING",
    "threads": [
      {
        "name": "word-count-...-93e9a84057ad-StreamThread-1",
        "state": "RUNNING",
        "active_tasks": [],
        "standby_tasks": [],
        "clients": {}
      }
    ],
    "offsets": { "group": "", "consumers": [] },
    "stores": { "partitionRestoreInfos": [], "partitionLagInfos": [] },
    "state_changed_time": 1620691200000
  }
}
  • 34. Our TODO list: Packaging (we still have 5’ left) 😬
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
  • 35. Packaging Kafka Streams with Azkarra
Azkarra-based applications can be packaged like any other Kafka Streams app.

Azkarra Worker → an empty Azkarra application:
■ Topologies and components can be loaded from an external uber-jar.
  ⎼ Similar to Kafka Connect plugins and connectors.
■ Can be used as the base image for Docker.
  ⎼ Use Jib to build optimized Docker images for Java (a possible Jib configuration follows below).

$ docker run --net host \
    -v ./application.conf:/etc/azkarra/azkarra.conf \
    -v ./local-topologies:/usr/share/azkarra-components/ \
    streamthoughts/azkarra-streams-worker:latest

Jib + Docker + Azkarra = ❤
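For the Jib route, a build-plugin configuration along these lines is typical; the plugin coordinates are real, but the base and target image names here are illustrative assumptions, not values prescribed by Azkarra:

<plugin>
    <groupId>com.google.cloud.tools</groupId>
    <artifactId>jib-maven-plugin</artifactId>
    <version>3.1.4</version>
    <configuration>
        <!-- Illustrative images only; adjust to your base image and registry -->
        <from>
            <image>eclipse-temurin:11-jre</image>
        </from>
        <to>
            <image>my-registry.example.com/word-count:1.0</image>
        </to>
    </configuration>
</plugin>

Running mvn jib:build then pushes the image without requiring a local Docker daemon.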
  • 36. Deploying Kafka Streams with Azkarra (in Kubernetes)
Using Kubernetes, topologies can be downloaded and mounted using an init-container (see the manifest sketch below):
■ A Deployment, StatefulSet, or similar runs the main container (image: azkarra-worker).
■ An init-container fetches my-topology-with-dependencies-1.0.jar over HTTP from a repository manager (e.g., Nexus or Artifactory).
■ Both containers share a volume mounted at /var/lib/components/, referenced via azkarra.component.paths.
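A sketch of that init-container pattern as a Kubernetes manifest; every name, image tag, and URL is an illustrative assumption, not a value prescribed by Azkarra:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: word-count
spec:
  serviceName: word-count
  replicas: 1
  selector:
    matchLabels:
      app: word-count
  template:
    metadata:
      labels:
        app: word-count
    spec:
      initContainers:
        - name: fetch-topology
          image: curlimages/curl:latest   # hypothetical fetcher image
          # Download the topology uber-jar from a repository manager (e.g., Nexus)
          args:
            - "-fsSL"
            - "-o"
            - "/var/lib/components/my-topology-with-dependencies-1.0.jar"
            - "https://nexus.example.com/releases/my-topology-with-dependencies-1.0.jar"
          volumeMounts:
            - name: components
              mountPath: /var/lib/components
      containers:
        - name: azkarra-worker
          image: streamthoughts/azkarra-streams-worker:latest
          # azkarra.component.paths must point at the shared volume; whether you
          # set it in the config file or the environment depends on your setup
          volumeMounts:
            - name: components
              mountPath: /var/lib/components
      volumes:
        - name: components
          emptyDir: {}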
  • 37. DONE: in less than 30 min using Azkarra 🚀
▢ Test the app is working as expected
▢ Externalize configuration
▢ Handle transient errors
▢ Handle deserialization exceptions
▢ Expose the state of the Kafka Streams application
▢ Be able to monitor offsets and lags of consumers and state stores
▢ Interactive Queries (optional)
▢ Package the Kafka Streams application for production
  • 38. Demo (new coins... we still have 5’ left) 🤫
  • 39. Take-aways: Conclusion
Kafka Streams is a very good choice for quickly creating streaming applications, but building applications for production can be a lot of work.
Azkarra aims to be a fast path to production by providing all the cool features you need:
■ Built-in mechanisms for handling exceptions
■ Built-in REST API for executing Interactive Queries
■ Consumer offsets lag
■ Topology visualization
■ Dashboard UI
  • 40. Take-aways: Roadmap
■ Add support for querying stale stores.
■ Add support for deploying and managing Kafka Streams topologies directly in Kubernetes.
  ❏ i.e., KubStreamsExecutionEnvironment
■ Enhance the WebUI with visualizations for the key metrics to monitor.
  • 41. Take-aways: Links
Official website: https://www.azkarrastreams.io/
GitHub: https://github.com/streamthoughts/azkarra-streams (for contributing and adding a ⭐)
Slack: https://communityinviter.com/apps/azkarra-streams/azkarra-streams-community (join us on Slack!)
Demo: https://github.com/streamthoughts/demo-kafka-streams-scottify
  • 42. Thank you!
Florian HUSSONNOIS ▪ @fhussonnois ▪ florian@streamthoughts.io
  • 45. Images & Icons
■ Photo by Mark König on Unsplash
■ Photo by CHUTTERSNAP on Unsplash