1Confidential
Stream Application Development with Apache® Kafka™
Matthias J. Sax | Software Engineer
matthias@confluent.io
@MatthiasJSax
2Confidential
Apache Kafka is …
...a distributed streaming platform
Consumers
Producers
Connectors
Processing
3Confidential
Confluent is ...
…a company founded by the original creators of Apache Kafka
...a distributed streaming platform
• Built on Apache Kafka
• Confluent Open Source
• Confluent Enterprise
All components but Kafka
are optional to run Confluent.
Mix-and-match them as required.
4Confidential
Kafka Streams is ...
… the easiest way to process data in Kafka (as of v0.10)
• Easy to use library
• Real stream processing / record by record / ms latency
• DSL
• Focus on applications
• No cluster / “cluster to-go”
• “DB to-go”
• Expressive
• Single record transformations
• Aggregations / Joins
• Time, windowing, out-of-order data
• Stream-table duality
• Tightly integrated within Kafka
• Fault-tolerant
• Scalable (s/m/l/xl), elastic
• Encryption, authentication, authorization
• Stateful
• Backed by Kafka
• Queryable / “DB to-go”
• Data reprocessing
• Application “reset button”
5Confidential
Before Kafka Streams
Do-it-yourself stream processing
• Hard to get right / lots of “glue code”
• Fault-tolerance / scalability … ???
plain consumer/producer clients
6Confidential
Before Kafka Streams
7Confidential
Before Kafka Streams
Do-it-yourself stream processing
• Hard to get right / lots of “glue code”
• Fault-tolerance / scalability … ???
Using a framework
• Requires a cluster
• Bare metal – hard to manage
• YARN / Mesos
• Test locally – deploy remotely
• “Can you please deploy my code?”
• Jar and dependency hell
How does your application interact with your stream processing job?
plain consumer/producer clients and others...
8Confidential
Before Kafka Streams
9Confidential
Build apps, not clusters!
10Confidential
Easy to use!
$ java -cp MyApp.jar io.confluent.MyApp
11Confidential
Easy to integrate!
12Confidential
Queryable / “DB to-go”
13Confidential
How to install Kafka Streams?
Not at all. It’s a library.
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>0.10.1.0</version>
</dependency>
14Confidential
How do I deploy my app?
Whatever works for you. It’s just an app like any other!
15Confidential
If it’s just a regular application…
• How does it scale?
• How can it be fault-tolerant?
• How does it handle distributed state?
Off-load hard problems to brokers.
• Kafka is a streaming platform: no need to reinvent the wheel
• Exploit consumer groups and group management protocol
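To make the idea of off-loading scaling and fault tolerance to consumer groups more concrete, here is a minimal sketch in plain Java with no Kafka dependency. It assumes a simple round-robin distribution of input partitions over running application instances. This is not Kafka's actual assignment strategy, and the names `TaskAssignmentSketch` and `assign` are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: spreads input partitions over the running app instances.
// Round-robin is an assumption of this sketch, not Kafka's real assignor.
class TaskAssignmentSketch {

    static Map<String, List<Integer>> assign(List<String> instances, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String instance : instances) {
            assignment.put(instance, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = instances.get(partition % instances.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // A single instance owns all four partitions.
        System.out.println(assign(Arrays.asList("app-1"), 4));
        // Starting a second instance triggers a rebalance: the work is split.
        System.out.println(assign(Arrays.asList("app-1", "app-2"), 4));
        // If app-1 fails, the surviving instance takes over everything.
        System.out.println(assign(Arrays.asList("app-2"), 4));
    }
}
```

Scaling out, scaling in, and recovering from a failed instance all reduce to the same step in this model: recompute the assignment for the current set of live instances.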
16Confidential
Scaling
17Confidential
Scaling
18Confidential
Scaling
Easy to scale!
It’s elastic!
“cluster to-go”
19Confidential
Fault-tolerance
Rebalance
Consumer
Group
20Confidential
Distributed State
State stores
21Confidential
Distributed State
State stores
22Confidential
Distributed State
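The slides show state stores that are backed by Kafka. A rough way to picture this, as a plain-Java sketch under stated assumptions: every update to a local store is also appended to a changelog, and a new or restarted instance rebuilds its store by replaying that changelog. The in-memory list below merely stands in for a Kafka changelog topic, and the class name `ChangelogBackedStoreSketch` is hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a local key-value store whose updates are mirrored to a changelog.
class ChangelogBackedStoreSketch {

    private final Map<String, Long> local = new HashMap<>();
    // Stand-in for a Kafka changelog topic (an assumption of this sketch).
    private final List<Map.Entry<String, Long>> changelog;

    ChangelogBackedStoreSketch(List<Map.Entry<String, Long>> changelog) {
        this.changelog = changelog;
        // Restore: replay all logged updates in order; the latest update per key wins.
        for (Map.Entry<String, Long> update : changelog) {
            local.put(update.getKey(), update.getValue());
        }
    }

    void put(String key, long value) {
        local.put(key, value);
        changelog.add(new AbstractMap.SimpleImmutableEntry<String, Long>(key, value));
    }

    Long get(String key) {
        return local.get(key);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> log = new ArrayList<>();
        ChangelogBackedStoreSketch first = new ChangelogBackedStoreSketch(log);
        first.put("alice", 1L);
        first.put("alice", 2L);
        // A replacement instance recovers the state from the changelog alone.
        ChangelogBackedStoreSketch second = new ChangelogBackedStoreSketch(log);
        System.out.println(second.get("alice"));
    }
}
```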
23Confidential
Yes, it’s complicated…
API, coding
Org. processes
Reality™
Operations
Security
…
Architecture
24Confidential
But…
API, coding
Org. processes
Reality™
Operations
Security
…
Architecture
You
Kafka core / Kafka Streams
25Confidential
KStream/KTable
• KStream
• Record stream
• Each record describes an event in the real world
• Example: click stream
• KTable
• Changelog stream
• Each record describes a change to a previous record
• Example: position report stream
• In Kafka Streams:
• KTable holds a materialized view of the latest update per key as internal state
26Confidential
KTable
User profile/location information
alice paris bob zurich alice berlin
Changelog stream
alice paris
KTable state
alice paris
KTable state
bob zurich
alice berlin
KTable state
bob zurich
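The KTable state shown on this slide can be reproduced with a small plain-Java sketch: each changelog record is an upsert, so the table always holds the latest value per key. The class and method names are invented for illustration, and a `LinkedHashMap` stands in for the real state store.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: materializes a changelog stream into a table (latest value per key).
class KTableUpsertSketch {

    static Map<String, String> materialize(String[][] changelog) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] update : changelog) {
            // An update for an existing key overwrites the previous value.
            table.put(update[0], update[1]);
            System.out.println("after <" + update[0] + ":" + update[1] + "> state = " + table);
        }
        return table;
    }

    public static void main(String[] args) {
        materialize(new String[][] {
            {"alice", "paris"}, {"bob", "zurich"}, {"alice", "berlin"}
        });
    }
}
```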
27Confidential
KTable (count moves)
alice paris bob zurich alice berlin
Record stream
alice 0
KTable state
count()
Changelog stream (output)
alice 0
alice 0
KTable state
bob 0
count()
bob 0
alice 1
KTable state
bob 0
count()
alice 1
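The same example can be sketched in plain Java to show how an aggregation turns a record stream into a changelog stream. Following the slide, the first record per user counts as zero moves and every later record adds one; note that this differs from counting records, which would start at 1. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: per-key aggregation that emits every state update downstream.
class CountMovesSketch {

    static List<String> countMoves(String[] userIds) {
        Map<String, Long> state = new HashMap<>();   // the KTable state
        List<String> changelog = new ArrayList<>();  // the output changelog stream
        for (String user : userIds) {
            Long previous = state.get(user);
            // First position report: 0 moves. Each further report: one more move.
            long moves = (previous == null) ? 0L : previous + 1;
            state.put(user, moves);
            changelog.add(user + " " + moves);
        }
        return changelog;
    }

    public static void main(String[] args) {
        // Record stream from the slide: alice paris, bob zurich, alice berlin
        System.out.println(countMoves(new String[] {"alice", "bob", "alice"}));
    }
}
```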
28Confidential
KTable (cont.)
• Internal state:
• Continuously updating materialized view of the latest status
• Downstream result (“output”)
• Changelog stream, describing every update to the materialized view
KStream stream = …
KTable table = stream.groupByKey().aggregate(...)
It’s the changelog!
29Confidential
KStream/KTable
30Confidential
Time and Windows
• Event time (default)
• Create time
• (Broker) Ingestion time
• Customized
• (Hopping) Time windows
• Overlapping or non-overlapping (tumbling)
• For aggregations
• Processing Time
• Sliding windows
• For KStream-KStream joins
KStream stream = …
KTable table = stream.groupByKey().aggregate(..., TimeWindows.of(10 * 1000L), ...);
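To illustrate how hopping windows overlap, the following plain-Java sketch computes which windows a record with a given timestamp falls into. It assumes windows are aligned to timestamp 0, are half-open intervals `[start, start + size)`, and that the advance is no larger than the size. The helper `windowStartsFor` is hypothetical and not part of the Kafka Streams API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: assigns a timestamp to all hopping windows that contain it.
class HoppingWindowSketch {

    static List<Long> windowStartsFor(long timestampMs, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        // Earliest aligned window start whose window still covers the timestamp.
        long start = Math.max(0L, timestampMs - sizeMs + advanceMs) / advanceMs * advanceMs;
        while (start <= timestampMs) {
            starts.add(start);
            start += advanceMs;
        }
        return starts;
    }

    public static void main(String[] args) {
        // 5-minute windows advancing every minute: each record lands in up to 5 windows.
        System.out.println(windowStartsFor(450_000L, 300_000L, 60_000L));
        // Tumbling windows (advance == size): each record lands in exactly one window.
        System.out.println(windowStartsFor(25_000L, 10_000L, 10_000L));
    }
}
```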
31Confidential
KTable Semantics
• Non-windowed:
• State is kept forever:
• Out-of-order/late-arriving records can be handled in a straightforward way
• KTable aggregation can be viewed as a landmark window (i.e., window size == infinite)
• Output is a changelog stream
• Windowed:
• Windows (i.e., state) are kept ”forever” (well, there is a configurable retention time)
• Out-of-order/late-arriving records can be handled in a straightforward way
• Output is a changelog stream
• No watermarks required
• Early updates/results
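The following plain-Java sketch shows why keeping window state around makes out-of-order records unproblematic: a late record simply updates its old window, and the corrected count is emitted as another changelog entry. It assumes tumbling 10-second windows and non-negative timestamps, and it does not model the retention time. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: windowed count that emits an update for every record, early or late.
class LateRecordSketch {

    static final long WINDOW_SIZE_MS = 10_000L;

    final Map<Long, Long> countsByWindowStart = new TreeMap<>();
    final List<String> changelog = new ArrayList<>();

    void process(long eventTimeMs) {
        long windowStart = eventTimeMs / WINDOW_SIZE_MS * WINDOW_SIZE_MS;
        // The window state is still available, so a late record is just another update.
        long updated = countsByWindowStart.merge(windowStart, 1L, Long::sum);
        changelog.add("window@" + windowStart + " -> " + updated);
    }

    public static void main(String[] args) {
        LateRecordSketch counter = new LateRecordSketch();
        counter.process(1_000L);
        counter.process(12_000L);
        counter.process(3_000L); // arrives after a record of the next window
        System.out.println(counter.changelog);
    }
}
```

Because every update is emitted right away, downstream consumers see early results and later corrections without any watermark deciding when a window is complete.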
32Confidential
Show Code!
33Confidential
Page Views per Region
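[Figure: dataflow. The Click Stream <userId:page> and the Profile Changelog <userId:region> (materialized as Current User Info, key/val) feed a Stream/Table join; the joined stream <region:page> is counted (Cnt) to produce PageViews per Region.]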
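Before looking at the Kafka Streams code, the dataflow can be walked through in plain Java. This is a simplified sketch: it uses a fixed snapshot of the user profiles instead of a continuously updated table, ignores windowing, and maps users without a profile to a made-up `"UNKNOWN"` region. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: join page views with user regions, re-key by region, and count.
class PageViewsPerRegionSketch {

    // userProfiles: <userId : region>; views: sequence of {userId, page}
    static Map<String, Long> countViewsPerRegion(Map<String, String> userProfiles, String[][] views) {
        Map<String, Long> viewsPerRegion = new TreeMap<>();
        for (String[] view : views) {
            String userId = view[0];
            // Stream-table join: look up the user's current region.
            // A left join keeps views of users without a profile.
            String region = userProfiles.getOrDefault(userId, "UNKNOWN");
            // Re-key by region and update the count.
            viewsPerRegion.merge(region, 1L, Long::sum);
        }
        return viewsPerRegion;
    }

    public static void main(String[] args) {
        Map<String, String> profiles = new HashMap<>();
        profiles.put("alice", "europe");
        profiles.put("bob", "asia");
        String[][] views = {
            {"alice", "index.html"}, {"bob", "news.html"},
            {"alice", "about.html"}, {"carol", "index.html"}
        };
        System.out.println(countViewsPerRegion(profiles, views));
    }
}
```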
34Confidential
Page Views per Region
final KStreamBuilder builder = new KStreamBuilder();
// read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
35Confidential
Page Views per Region
final KStreamBuilder builder = new KStreamBuilder();
// read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
// enrich page views with user’s region -- stream-table-join
final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles,
    (page, userRegion) -> page + "," + userRegion )
    // and set "region" as new key
    .map( (userId, pageAndRegion) -> new KeyValue<>(pageAndRegion.split(",")[1], pageAndRegion.split(",")[0]) );
36Confidential
Page Views per Region
final KStreamBuilder builder = new KStreamBuilder();
// read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
// enrich page views with user’s region -- stream-table-join AND set "region" as new key
final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>
// count views by region, using hopping windows of size 5 minutes that advance every 1 minute
final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey
    .groupByKey() // redistribute data
    .count(TimeWindows.of(5 * 60 * 1000L).advanceBy(60 * 1000L), "GeoPageViewsStore");
37Confidential
Page Views per Region
final KStreamBuilder builder = new KStreamBuilder();
// read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
// enrich page views with user’s region -- stream-table-join AND set "region" as new key
final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>
// count views by region, using hopping windows of size 5 minutes that advance every 1 minute
final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey.groupByKey().count(TimeWindows.of(...)..., ...);
// write result
viewsByRegion.toStream( (windowedRegion, count) -> windowedRegion.toString() ) // prepare result
    .to(stringSerde, longSerde, "PageViewsByRegion"); // write to topic "PageViewsByRegion"
38Confidential
Page Views per Region
final KStreamBuilder builder = new KStreamBuilder();
// read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
// enrich page views with user’s region -- stream-table-join AND set "region" as new key
final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>
// count views by region, using hopping windows of size 5 minutes that advance every 1 minute
final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey.groupByKey().count(TimeWindows.of(...)..., ...);
// write result to topic "PageViewsByRegion"
viewsByRegion.toStream(...).to(..., "PageViewsByRegion");
// start application
final KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration); // streamsConfiguration omitted for brevity
streams.start();
// stop application
streams.close();
/* https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/PageViewRegionExample.java */
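The slide omits `streamsConfiguration`. As a rough sketch of what such a configuration might contain, the snippet below sets the two settings a Kafka Streams application needs at minimum, using their plain string keys so that it compiles without the Kafka library. The application id and the broker address are placeholder values, not taken from the talk; the linked example shows the complete configuration.

```java
import java.util.Properties;

// Illustration only: a minimal configuration object for the example application.
class StreamsConfigSketch {

    static Properties streamsConfiguration() {
        Properties props = new Properties();
        // Identifies the application; instances sharing this id form one consumer group.
        props.put("application.id", "pageview-region-example"); // placeholder value
        // Where to find the Kafka cluster; assumes a broker on the local machine.
        props.put("bootstrap.servers", "localhost:9092");       // placeholder value
        return props;
    }

    public static void main(String[] args) {
        System.out.println(streamsConfiguration());
    }
}
```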
39Confidential
Interactive Queries
• KTable is a changelog stream with materialized internal view (state)
• KStream-KTable join can do lookups into the materialized view
• What if the application could do lookups, too?
https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
Yes, it can!
“DB to-go“
40Confidential
Interactive Queries
charlie 3 bob 5 alice 2
New API to access
local state stores of
an app instance
41Confidential
Interactive Queries
charlie 3 bob 5 alice 2
New API to discover
running app instances
“host1:4460” “host5:5307” “host3:4777”
42Confidential
Interactive Queries
charlie 3 bob 5 alice 2
You: inter-app communication (RPC layer)
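The three pieces from these slides (local store access, instance discovery, and an RPC layer you provide yourself) can be sketched in plain Java. This is a toy model, not the Kafka Streams API: nested maps stand in for the per-instance state stores, a direct method call stands in for the RPC, and the pairing of keys with hosts is assumed from the order shown on the slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: route a key lookup to the app instance that hosts the key.
class InteractiveQuerySketch {

    // "host:port" -> local state store of that app instance
    final Map<String, Map<String, Long>> instances = new LinkedHashMap<>();

    void addInstance(String host, String key, long count) {
        instances.computeIfAbsent(host, h -> new LinkedHashMap<>()).put(key, count);
    }

    // Discovery step: find the instance whose local store holds the key.
    String hostFor(String key) {
        for (Map.Entry<String, Map<String, Long>> instance : instances.entrySet()) {
            if (instance.getValue().containsKey(key)) {
                return instance.getKey();
            }
        }
        return null;
    }

    // Query step: in a real deployment this would be a remote call to hostFor(key).
    Long query(String key) {
        String host = hostFor(key);
        return host == null ? null : instances.get(host).get(key);
    }

    public static void main(String[] args) {
        InteractiveQuerySketch app = new InteractiveQuerySketch();
        app.addInstance("host1:4460", "charlie", 3L);
        app.addInstance("host5:5307", "bob", 5L);
        app.addInstance("host3:4777", "alice", 2L);
        System.out.println("bob is hosted on " + app.hostFor("bob") + ", count = " + app.query("bob"));
    }
}
```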
43Confidential
Wrapping Up
• Kafka Streams is available in Apache Kafka 0.10 and Confluent Platform 3.1
• http://kafka.apache.org/
• http://www.confluent.io/download (OS + enterprise versions, tar/zip/deb/rpm)
• Kafka Streams demos at https://github.com/confluentinc/examples
• Java 7, Java 8+ with lambdas, and Scala
• WordCount, Joins, Avro integration, Top-N computation, Windowing, Interactive Queries
• Apache Kafka documentation: http://kafka.apache.org/documentation.html
• Confluent documentation: http://docs.confluent.io/current/streams/
• Quickstart, Concepts, Architecture, Developer Guide, FAQ
• Join our bi-weekly Confluent Developer Roundtable sessions on Kafka Streams
• Contact me at matthias@confluent.io for details
44Confidential
Thank You
We are hiring!

Introduction Computer Science - Software Design.pdfFerryKemperman
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 

Kürzlich hochgeladen (20)

VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
How to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion ApplicationHow to submit a standout Adobe Champion Application
How to submit a standout Adobe Champion Application
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 

Stream Application Development with Apache Kafka

  • 1. Confidential. Stream Application Development with Apache® Kafka™. Matthias J. Sax | Software Engineer | matthias@confluent.io | @MatthiasJSax
  • 2. Apache Kafka is … a distributed streaming platform: Consumers, Producers, Connectors, Processing
  • 3. Confluent is … a company founded by the original creators of Apache Kafka … a distributed streaming platform • Built on Apache Kafka • Confluent Open Source • Confluent Enterprise. All components but Kafka are optional to run Confluent. Mix and match them as required.
  • 4. Kafka Streams is … the easiest way to process data in Kafka (as of v0.10) • Easy-to-use library • Real stream processing / record by record / ms latency • DSL • Focus on applications • No cluster / “cluster to-go” • “DB to-go” • Expressive • Single-record transformations • Aggregations / Joins • Time, windowing, out-of-order data • Stream-table duality • Tightly integrated within Kafka • Fault-tolerant • Scalable (s/m/l/xl), elastic • Encryption, authentication, authorization • Stateful • Backed by Kafka • Queryable / “DB to-go” • Data reprocessing • Application “reset button”
  • 5. Before Kafka Streams: Do-it-yourself stream processing (plain consumer/producer clients) • Hard to get right / lots of “glue code” • Fault-tolerance / scalability … ???
  • 7. Before Kafka Streams: Do-it-yourself stream processing (plain consumer/producer clients) • Hard to get right / lots of “glue code” • Fault-tolerance / scalability … ??? Using a framework (and others...) • Requires a cluster • Bare metal – hard to manage • YARN / Mesos • Test locally – deploy remotely • “Can you please deploy my code?” • JAR and dependency hell. How does your application interact with your stream processing job?
  • 10. Easy to use! $ java -cp MyApp.jar io.confluent.MyApp
  • 13. How to install Kafka Streams? Not at all. It’s a library.
        <dependency>
          <groupId>org.apache.kafka</groupId>
          <artifactId>kafka-streams</artifactId>
          <version>0.10.1.0</version>
        </dependency>
  • 14. How do I deploy my app? Whatever works for you. It’s just an app like any other!
  • 15. If it’s just a regular application… • How does it scale? • How can it be fault-tolerant? • How does it handle distributed state? Off-load hard problems to brokers. • Kafka is a streaming platform: no need to reinvent the wheel • Exploit consumer groups and the group management protocol
  • 18. Scaling: Easy to scale! It’s elastic! “cluster to-go”
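
The elasticity claimed on the scaling slides rests on consumer groups: the input partitions are divided among the running instances and redistributed when an instance joins or leaves. The following plain-Java sketch only illustrates that idea. The round-robin rule, the class name `RebalanceSketch`, and the instance names are assumptions made for this example; it is not Kafka's actual partition assignor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RebalanceSketch {

    // Hypothetical helper: spread partitions 0..numPartitions-1 over the instances round-robin.
    static Map<String, List<Integer>> assign(List<String> instances, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String instance : instances) {
            assignment.put(instance, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = instances.get(p % instances.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // One instance owns all four partitions.
        System.out.println(assign(Arrays.asList("app-1"), 4));
        // Starting a second instance splits the work; no cluster manager is involved.
        System.out.println(assign(Arrays.asList("app-1", "app-2"), 4));
    }
}
```

Starting another copy of the same application is the whole scaling operation in this model; stopping one moves its partitions back to the remaining instances.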
  • 23. Yes, it’s complicated… API, coding • Org. processes • Reality™ • Operations • Security • Architecture • …
  • 25. KStream/KTable • KStream • Record stream • Each record describes an event in the real world • Example: click stream • KTable • Changelog stream • Each record describes a change to a previous record • Example: position report stream • In Kafka Streams: • KTable holds a materialized view of the latest update per key as internal state
  • 26. KTable: user profile/location information. Changelog stream: (alice, paris), (bob, zurich), (alice, berlin). KTable state after each record: {alice: paris} → {alice: paris, bob: zurich} → {alice: berlin, bob: zurich}
  • 27. KTable (count moves). Record stream: (alice, paris), (bob, zurich), (alice, berlin). KTable state after each count(): {alice: 0} → {alice: 0, bob: 0} → {alice: 1, bob: 0}. Changelog stream (output): (alice, 0), (bob, 0), (alice, 1)
  • 28. KTable (cont.) • Internal state: • Continuously updating materialized view of the latest status • Downstream result (“output”) • Changelog stream, describing every update to the materialized view
        KStream stream = …
        KTable table = stream.aggregate(...)
      It’s the changelog!
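
The KTable slides can be mimicked without Kafka. In this plain-Java sketch a `Map` stands in for the materialized view, and the list returned by `countMoves` stands in for the downstream changelog. The class and method names are made up for illustration; the input records and the "first record counts as 0 moves" convention follow the alice/bob example on the slides.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KTableSketch {

    // The three records used on the slides: <user, location>.
    static List<String[]> sample() {
        return Arrays.asList(
            new String[] {"alice", "paris"},
            new String[] {"bob", "zurich"},
            new String[] {"alice", "berlin"});
    }

    // Changelog semantics: a later record for the same key overwrites the earlier one.
    static Map<String, String> materialize(List<String[]> changelog) {
        Map<String, String> state = new LinkedHashMap<>();
        for (String[] record : changelog) {
            state.put(record[0], record[1]);
        }
        return state;
    }

    // Count moves per user and emit every update to the count as an output record.
    static List<String> countMoves(List<String[]> records) {
        Map<String, Long> state = new HashMap<>();
        List<String> outputChangelog = new ArrayList<>();
        for (String[] record : records) {
            Long previous = state.get(record[0]);
            long updated = (previous == null) ? 0L : previous + 1; // first location is not a move
            state.put(record[0], updated);
            outputChangelog.add(record[0] + " " + updated);
        }
        return outputChangelog;
    }

    public static void main(String[] args) {
        System.out.println(materialize(sample())); // latest location per user
        System.out.println(countMoves(sample()));  // one output record per state update
    }
}
```

The state only ever holds the latest value per key, while the output contains every intermediate update, which is the stream-table duality in miniature.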
  • 30. Time and Windows • Event time (default) • Create time • (Broker) Ingestion time • Customized • (Hopping) Time windows • Overlapping or non-overlapping (tumbling) • For aggregations • Processing time • Sliding windows • For KStream-KStream joins
        KStream stream = …
        KTable table = stream.aggregate(TimeWindows.of(10 * 1000), ...);
  • 31. KTable Semantics • Non-windowed: • State is kept forever • Out-of-order/late-arriving records can be handled in a straightforward way • KTable aggregation can be viewed as a landmark window (i.e., window size == infinite) • Output is a changelog stream • Windowed: • Windows (i.e., state) are kept “forever” (well, there is a configurable retention time) • Out-of-order/late-arriving records can be handled in a straightforward way • Output is a changelog stream • No watermarks required • Early updates/results
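
A small plain-Java sketch may help with the windowing slides. `windowStartsFor` lists the hopping windows that contain a timestamp, and `countPerWindow` shows why no watermarks are needed: because window state is retained, a late record simply updates its old window and produces a new changelog entry. All names are hypothetical, windows are assumed to be `[start, start + size)` with non-negative start times, and retention is ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HoppingWindowSketch {

    // Start times of all windows [start, start + sizeMs) that contain ts, in ascending order.
    static List<Long> windowStartsFor(long ts, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long latest = ts - (ts % advanceMs); // latest window start that is <= ts
        for (long start = latest; start > ts - sizeMs && start >= 0; start -= advanceMs) {
            starts.add(start);
        }
        Collections.reverse(starts);
        return starts;
    }

    // Windowed count that emits one changelog entry per update, in arrival order.
    static List<String> countPerWindow(long[] timestamps, long sizeMs, long advanceMs) {
        Map<Long, Long> counts = new TreeMap<>();
        List<String> changelog = new ArrayList<>();
        for (long ts : timestamps) {
            for (long start : windowStartsFor(ts, sizeMs, advanceMs)) {
                long updated = counts.merge(start, 1L, Long::sum);
                changelog.add("window@" + start + " -> " + updated);
            }
        }
        return changelog;
    }

    public static void main(String[] args) {
        // 5-minute windows advancing every minute: a record at 2m10s falls into three windows.
        System.out.println(windowStartsFor(130_000L, 300_000L, 60_000L));
        // 10-second tumbling windows; the third record arrives late and updates window 0 again.
        System.out.println(countPerWindow(new long[] {1_000L, 12_000L, 3_000L}, 10_000L, 10_000L));
    }
}
```

Setting the advance equal to the size gives tumbling windows; a smaller advance gives overlapping windows, so one record updates several counts.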
  • 33. Page Views per Region (dataflow diagram): click stream <userId:page> and profile changelog <userId:region> (materialized as current user info, key/val) → stream/table join → <region:page> → count (Cnt) → PageViews per Region
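
The dataflow of this example can be simulated in plain Java to see what it computes. The sketch below keeps the latest region per user (the table side), looks up each page view in that state (the stream-table join), re-keys by region, and counts. Windowing is deliberately left out, and the event encoding, the sample data, and the `UNKNOWN` placeholder for users without a profile are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PageViewsPerRegionSketch {

    // Hypothetical interleaved input: {"profile", userId, region} or {"view", userId, page}.
    static List<String[]> sampleEvents() {
        return Arrays.asList(
            new String[] {"profile", "alice", "europe"},
            new String[] {"view", "alice", "index.html"},
            new String[] {"profile", "bob", "asia"},
            new String[] {"view", "bob", "news.html"},
            new String[] {"view", "alice", "about.html"},
            new String[] {"view", "carol", "index.html"}); // no profile for carol
    }

    static Map<String, Long> viewsPerRegion(List<String[]> events) {
        Map<String, String> regionByUser = new HashMap<>(); // table state: latest region per user
        Map<String, Long> counts = new TreeMap<>();         // result state: views per region
        for (String[] event : events) {
            if (event[0].equals("profile")) {
                regionByUser.put(event[1], event[2]);        // changelog update
            } else {
                String region = regionByUser.get(event[1]);  // lookup = stream-table left join
                String newKey = (region == null) ? "UNKNOWN" : region; // re-key by region
                counts.merge(newKey, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(viewsPerRegion(sampleEvents()));
    }
}
```

Because the join is a lookup into the current table state, a page view is enriched with whatever region is known at the moment it is processed.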
  • 34. Page Views per Region
        final KStreamBuilder builder = new KStreamBuilder();

        // read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
        final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
        final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>
  • 35. Page Views per Region
        final KStreamBuilder builder = new KStreamBuilder();

        // read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
        final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
        final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>

        // enrich page views with user's region -- stream-table-join
        final KStream<String, String> viewsWithRegionKey = views
            .leftJoin(userProfiles, (page, userRegion) -> page + "," + userRegion)
            // and set "region" as new key
            .map((userId, pageAndRegion) ->
                new KeyValue<>(pageAndRegion.split(",")[1], pageAndRegion.split(",")[0]));
  • 36. Page Views per Region
        final KStreamBuilder builder = new KStreamBuilder();

        // read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
        final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
        final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>

        // enrich page views with user's region -- stream-table-join AND set "region" as new key
        final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>

        // count views by region, using hopping windows of size 5 minutes that advance every 1 minute
        final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey
            .groupByKey() // redistribute data
            .count(TimeWindows.of(5 * 60 * 1000L).advanceBy(60 * 1000L), "GeoPageViewsStore");
  • 37. Page Views per Region
        final KStreamBuilder builder = new KStreamBuilder();

        // read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
        final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
        final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>

        // enrich page views with user's region -- stream-table-join AND set "region" as new key
        final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>

        // count views by region, using hopping windows of size 5 minutes that advance every 1 minute
        final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey.groupByKey().count(TimeWindows.of(...)..., ...);

        // write result
        viewsByRegion
            .toStream((windowedRegion, count) -> windowedRegion.toString()) // prepare result
            .to(stringSerde, longSerde, "PageViewsByRegion"); // write to topic "PageViewsByRegion"
  • 38. Page Views per Region
        final KStreamBuilder builder = new KStreamBuilder();

        // read record stream from topic "PageViews" and changelog stream from topic "UserProfiles"
        final KStream<String, String> views = builder.stream("PageViews"); // <userId : page>
        final KTable<String, String> userProfiles = builder.table("UserProfiles", "UserProfilesStore"); // <userId : region>

        // enrich page views with user's region -- stream-table-join AND set "region" as new key
        final KStream<String, String> viewsWithRegionKey = views.leftJoin(userProfiles, ...).map(...); // <region : page>

        // count views by region, using hopping windows of size 5 minutes that advance every 1 minute
        final KTable<Windowed<String>, Long> viewsByRegion = viewsWithRegionKey.groupByKey().count(TimeWindows.of(...)..., ...);

        // write result to topic "PageViewsByRegion"
        viewsByRegion.toStream(...).to(..., "PageViewsByRegion");

        // start application
        final KafkaStreams streams = new KafkaStreams(builder, streamsConfiguration); // streamsConfiguration omitted for brevity
        streams.start();

        // stop application
        streams.close();

        /* https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/PageViewRegionExample.java */
  • 39. Interactive Queries • KTable is a changelog stream with materialized internal view (state) • KStream-KTable join can do lookups into the materialized view • What if the application could do lookups, too? Yes, it can! “DB to-go” https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
  • 40. Interactive Queries (state: charlie 3, bob 5, alice 2): New API to access local state stores of an app instance
  • 41. Interactive Queries (state: charlie 3, bob 5, alice 2): New API to discover running app instances (“host1:4460”, “host5:5307”, “host3:4777”)
  • 42. Interactive Queries (state: charlie 3, bob 5, alice 2): You: inter-app communication (RPC layer)
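
The three Interactive Queries slides describe a division of labor: the library exposes the local state store and the instance metadata, and the application supplies the RPC layer. The plain-Java sketch below only illustrates that routing decision. The class name, the `keyToHost` map, and the assignment of keys to hosts are assumptions for this example (the slides show the hosts and the counts but not which host owns which key), and the remote call is replaced by a string.

```java
import java.util.HashMap;
import java.util.Map;

public class InteractiveQuerySketch {
    private final String self;                   // this instance's host:port
    private final Map<String, Long> localStore;  // stands in for the local state store
    private final Map<String, String> keyToHost; // stands in for discovered instance metadata

    InteractiveQuerySketch(String self, Map<String, Long> localStore, Map<String, String> keyToHost) {
        this.self = self;
        this.localStore = localStore;
        this.keyToHost = keyToHost;
    }

    // Answer from local state if this instance owns the key, otherwise name the instance to ask.
    String lookup(String key) {
        String host = keyToHost.get(key);
        if (host == null) {
            return "unknown key";
        }
        if (host.equals(self)) {
            return key + "=" + localStore.get(key) + " (local)";
        }
        return "forward to " + host; // a real app would issue an RPC here (for example over HTTP)
    }

    // Hypothetical setup: the key-to-host mapping is invented for this example.
    static InteractiveQuerySketch demo() {
        Map<String, String> keyToHost = new HashMap<>();
        keyToHost.put("alice", "host1:4460");
        keyToHost.put("bob", "host5:5307");
        keyToHost.put("charlie", "host3:4777");
        Map<String, Long> localStore = new HashMap<>();
        localStore.put("alice", 2L);
        return new InteractiveQuerySketch("host1:4460", localStore, keyToHost);
    }

    public static void main(String[] args) {
        InteractiveQuerySketch app = demo();
        System.out.println(app.lookup("alice"));
        System.out.println(app.lookup("bob"));
    }
}
```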
  • 43. Wrapping Up • Kafka Streams is available in Apache Kafka 0.10 and Confluent Platform 3.1 • http://kafka.apache.org/ • http://www.confluent.io/download (OS + enterprise versions, tar/zip/deb/rpm) • Kafka Streams demos at https://github.com/confluentinc/examples • Java 7, Java 8+ with lambdas, and Scala • WordCount, Joins, Avro integration, Top-N computation, Windowing, Interactive Queries • Apache Kafka documentation: http://kafka.apache.org/documentation.html • Confluent documentation: http://docs.confluent.io/current/streams/ • Quickstart, Concepts, Architecture, Developer Guide, FAQ • Join our bi-weekly Confluent Developer Roundtable sessions on Kafka Streams • Contact me at matthias@confluent.io for details