SlideShare ist ein Scribd-Unternehmen logo
1 von 82
1Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Introducing Kafka Streams
Michael Noll
michael@confluent.io
Berlin Buzzwords, June 06, 2016
miguno
2Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
3Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
4Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
5Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
6Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
7Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stream processing in Real Life™
…before Kafka Streams
…somewhat exaggerated
…but perhaps not that much
8Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016Abandon all hope, ye who enter here.
9Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
How did this… (#machines == 1)
scala> val input = (1 to 6).toSeq
// Stateless computation
scala> val doubled = input.map(_ * 2)
Vector(2, 4, 6, 8, 10, 12)
// Stateful computation
scala> val sumOfOdds = input.filter(_ % 2 != 0).reduceLeft(_ + _)
res2: Int = 9
10Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
…turn into stuff like this (#machines > 1)
11Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
12Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
13Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
14Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016Taken at a session at ApacheCon: Big Data, Hungary, September 2015
15Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
stream processing made simple
16Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
• Powerful yet easy-to use Java library
• Part of open source Apache Kafka, introduced in v0.10, May 2016
• Source code: https://github.com/apache/kafka/tree/trunk/streams
• Build your own stream processing applications that are
• highly scalable
• fault-tolerant
• distributed
• stateful
• able to handle late-arriving, out-of-order data
• <more on this later>
17Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
18Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
19Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
What is Kafka Streams: Unix analogy
$ cat < in.txt | grep “apache” | tr a-z A-Z > out.txt
Kafka
Kafka Connect Kafka Streams
20Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
What is Kafka Streams: Java analogy
21Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
When to use Kafka Streams (as of Kafka 0.10)
Recommended use cases
• Application Development
• “Fast Data” apps (small or big
data)
• Reactive and stateful
applications
• Linear streams
• Event-driven systems
• Continuous transformations
• Continuous queries
• Microservices
Questionable use cases
• Data Science / Data
Engineering
• “Heavy lifting”
• Data mining
• Non-linear, branching streams
(graphs)
• Machine learning, number
crunching
• What you’d do in a data
warehouse
22Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Alright, can you show me some code now? 
KStream<Integer, Integer> input = builder.stream(“numbers-topic”);
// Stateless computation
KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2);
// Stateful computation
KTable<Integer, Integer> sumOfOdds = input
.filter((k,v) -> v % 2 != 0)
.selectKey((k, v) -> 1)
.reduceByKey((v1, v2) -> v1 + v2, ”sum-of-odds");
• API option 1: Kafka Streams DSL (declarative)
23Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Alright, can you show me some code now? 
Startup
Process a record
Periodic action
Shutdown
• API option 2: low-level Processor API (imperative)
24Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
API != everything
API, coding
Operations, debugging, …
“Full stack” evaluation
25Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
API != everything
API, coding
Operations, debugging, …
“Full stack” evaluation
26Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams outsources hard problems to Kafka
27Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
How do I install Kafka Streams?
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>0.10.0.0</version>
</dependency>
• There is and there should be no “install”.
• It’s a library. Add it to your app like any other library.
28Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
29Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Do I need to install a CLUSTER to run my apps?
• No, you don’t. Kafka Streams allows you to stay lean and lightweight.
• Unlearn bad habits: “do cool stuff with data != must have cluster”
Ok. Ok. Ok. Ok.
30Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
How do I package and deploy my apps? How do I …?
31Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
32Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
How do I package and deploy my apps? How do I …?
• Whatever works for you. Stick to what you/your company think is the
best way.
• Why? Because an app that uses Kafka Streams is…a normal Java app.
• Your Ops/SRE/InfoSec teams may finally start to love not hate you.
33Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka
concepts
34Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka concepts
35Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka concepts
36Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
concepts
37Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stream
38Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Processor topology
39Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stream partitions and stream tasks
40Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables
http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simpl
http://docs.confluent.io/3.0.0/streams/concepts.html#duality-of-streams-and-tables
41Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
alice 2 bob 10 alice 3
time
“Alice clicked 2 times.”
“Alice clicked 2 times.”KTable
= interprets data as changelog stream
~ is a continuously updated materialized view
KStream
= interprets data as record stream
42Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
alice 2 bob 10 alice 3
time
“Alice clicked 2+3 = 5
times.”
“Alice clicked 2 3 times.”KTable
= interprets data as changelog stream
~ is a continuously updated materialized view
KStream
= interprets data as record stream
43Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
alice eggs bob lettuce alice milk
alice paris bob zurich alice berlin
KStream
KTable
time
“Alice bought eggs.”
“Alice is currently in Paris.”
User purchase records
User profile/location information
44Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
alice eggs bob lettuce alice milk
alice paris bob zurich alice berlin
KStream
KTable
User purchase records
User profile/location information
time
“Alice bought eggs and milk.”
“Alice is currently in Paris Berlin.”
Show me
ALL values for
a key
Show me the
LATEST value
for a key
45Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
46Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
• JOIN example: compute user clicks by region via
KStream.leftJoin(KTable)
47Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
• JOIN example: compute user clicks by region via
KStream.leftJoin(KTable)
Even simpler in Scala because, unlike Java, it natively supports tuples:
48Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
• JOIN example: compute user clicks by region via
KStream.leftJoin(KTable)
alice 13 bob 5Input KStream
alice
(europe,
13)
bob (europe, 5)leftJoin()
w/ KTable
KStream
13europe 5europemap() KStream
KTablereduceByKey(_ + _) 13europe
……
……
18europe
……
……
49Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
50Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Streams meet Tables – in the Kafka Streams DSL
51Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams
key features
52Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
53Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Native Kafka integration
• Reading data from Kafka
• Writing data to Kafka
54Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Native Kafka integration
• You can configure both Kafka Streams plus the underlying Kafka clients
55Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
• Highly scalable
• Fault-tolerant
• Elastic
56Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
57Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
58Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
59Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
60Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams outsources hard problems to Kafka
61Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
• Highly scalable
• Fault-tolerant
• Elastic
• Stateful and stateless computations (e.g. joins, aggregations)
62Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stateful computations
• Stateful computations like aggregations or joins require state
• We already showed a join example in the previous slides.
• Windowing a stream is stateful, too, but let’s ignore this for now.
• State stores in Kafka Streams
• Typically: key-value stores
• Pluggable implementation: RocksDB (default), in-memory, your own …
• State stores are per stream task for isolation (think: share-nothing)
• State stores are local for best performance
• State stores are replicated to Kafka for elasticity and for fault-tolerance
63Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
State stores
(This is a bit simplified.)
64Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Remember?
65Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
State stores
(This is a bit simplified.)
charlie 3
bob 1
alice 1
alice 2
66Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
State stores
(This is a bit simplified.)
67Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Execution model
State stores
(This is a bit simplified.)
alice 1
alice 2
68Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Kafka Streams outsources hard problems to Kafka
69Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stateful computations
• Kafka Streams DSL: abstracts state stores away from you
• Stateful operations include
• count(), reduceByKey(), aggregate(), …
• Low-level Processor API: direct access to state stores
• Very flexible but more manual work for you
70Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Stateful computations
• Use the low-level Processor API to interact directly with state stores
Get the store
Use the store
71Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
• Highly scalable
• Fault-tolerant
• Elastic
• Stateful and stateless computations
• Time model
72Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Time
73Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Time
74Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Time
• You configure the desired time semantics through timestamp extractors
• Default extractor yields event-time semantics
• Extracts embedded timestamps of Kafka messages (introduced in v0.10)
75Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
• Highly scalable
• Fault-tolerant
• Elastic
• Stateful and stateless computations
• Time model
• Windowing
76Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Windowing
TimeWindows.of(3000)
TimeWindows.of(3000).advanceBy(1000)
77Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Windowing use case: monitoring (1m/5m/15m
averages)
Confluent Control Center for Kafka
78Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Key features in 0.10
• Native, 100%-compatible Kafka integration
• Also inherits Kafka’s security model, e.g. to encrypt data-in-transit
• Uses Kafka as its internal messaging layer, too
• Highly scalable
• Fault-tolerant
• Elastic
• Stateful and stateless computations
• Time model
• Windowing
• Supports late-arriving and out-of-order data
• Millisecond processing latency, no micro-batching
• At-least-once processing guarantees (exactly-once is in the works)
79Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Wrapping up
80Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Where to go from here?
• Kafka Streams is available in Apache Kafka 0.10 and Confluent Platform
3.0
• http://kafka.apache.org/
• http://www.confluent.io/download (free + enterprise versions,
tar/zip/deb/rpm)
• Kafka Streams demos at https://github.com/confluentinc/examples
• Java 7, Java 8+ with lambdas, and Scala
• WordCount, Joins, Avro integration, Top-N computation, Windowing, …
• Apache Kafka documentation:
http://kafka.apache.org/documentation.html
• Confluent documentation: http://docs.confluent.io/3.0.0/streams/
• Quickstart, Concepts, Architecture, Developer Guide, FAQ
• Join our bi-weekly Ask Me Anything sessions on Kafka Streams
• Contact me at michael@confluent.io for details
81Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Some of the things to come
• Exactly-once semantics
• Queriable state – tap into the state of your applications
• SQL interface
• Listen to and collaborate with the developer community
• Your feedback counts a lot! Share it via users@kafka.apache.org
Tomorrow’s keynote (09:30 AM) by Neha
Narkhede,
co-founder and CTO of Confluent
“Application development and data in the emerging
world of stream processing”
82Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
Want to contribute to Kafka and open source?
Join the Kafka community
http://kafka.apache.org/
Questions, comments? Tweet with #bbuzz and /cc to @ConfluentInc
…in a great team with the creators of Kafka?
Confluent is hiring 
http://confluent.io/

Weitere ähnliche Inhalte

Was ist angesagt?

Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingGuozhang Wang
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsLightbend
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignMichael Noll
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...confluent
 
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017Michael Noll
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Michael Noll
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsLightbend
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...confluent
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changesconfluent
 
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRKafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRconfluent
 
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...Paul Brebner
 
Real-world Streaming Architectures
Real-world Streaming ArchitecturesReal-world Streaming Architectures
Real-world Streaming Architecturesconfluent
 
Event stream processing using Kafka streams
Event stream processing using Kafka streamsEvent stream processing using Kafka streams
Event stream processing using Kafka streamsFredrik Vraalsen
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsSlim Baltagi
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformconfluent
 
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaExploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaLightbend
 
Follow the (Kafka) Streams
Follow the (Kafka) StreamsFollow the (Kafka) Streams
Follow the (Kafka) Streamsconfluent
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Erik Onnen
 
Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Gwen (Chen) Shapira
 

Was ist angesagt? (20)

Apache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream ProcessingApache Kafka, and the Rise of Stream Processing
Apache Kafka, and the Rise of Stream Processing
 
Revitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive StreamsRevitalizing Enterprise Integration with Reactive Streams
Revitalizing Enterprise Integration with Reactive Streams
 
Apache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - VerisignApache Kafka 0.8 basic training - Verisign
Apache Kafka 0.8 basic training - Verisign
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
Deep Dive Into Kafka Streams (and the Distributed Stream Processing Engine) (...
 
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017
Introducing Apache Kafka's Streams API - Kafka meetup Munich, Jan 25 2017
 
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, St...
 
Streaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka StreamsStreaming Microservices With Akka Streams And Kafka Streams
Streaming Microservices With Akka Streams And Kafka Streams
 
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
UDF/UDAF: the extensibility framework for KSQL (Hojjat Jafapour, Confluent) K...
 
Capture the Streams of Database Changes
Capture the Streams of Database ChangesCapture the Streams of Database Changes
Capture the Streams of Database Changes
 
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VRKafka Summit NYC 2017 Hanging Out with Your Past Self in VR
Kafka Summit NYC 2017 Hanging Out with Your Past Self in VR
 
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
ApacheCon Berlin 2019: Kongo:Building a Scalable Streaming IoT Application us...
 
Real-world Streaming Architectures
Real-world Streaming ArchitecturesReal-world Streaming Architectures
Real-world Streaming Architectures
 
Event stream processing using Kafka streams
Event stream processing using Kafka streamsEvent stream processing using Kafka streams
Event stream processing using Kafka streams
 
Kafka Streams for Java enthusiasts
Kafka Streams for Java enthusiastsKafka Streams for Java enthusiasts
Kafka Streams for Java enthusiasts
 
Apache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platformApache kafka-a distributed streaming platform
Apache kafka-a distributed streaming platform
 
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache KafkaExploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
Exploring Reactive Integrations With Akka Streams, Alpakka And Apache Kafka
 
Follow the (Kafka) Streams
Follow the (Kafka) StreamsFollow the (Kafka) Streams
Follow the (Kafka) Streams
 
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
 
Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016Kafka connect-london-meetup-2016
Kafka connect-london-meetup-2016
 

Ähnlich wie Introducing Kafka Streams, the new stream processing library of Apache Kafka, Berlin Buzzwords 2016

kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...
kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...
kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...HostedbyConfluent
 
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, Bakdata
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, BakdataStream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, Bakdata
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, BakdataHostedbyConfluent
 
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...HostedbyConfluent
 
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...Chris Fregly
 
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...HostedbyConfluent
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Chris Fregly
 
Brendan O'Bra Scala By the Schuykill
Brendan O'Bra Scala By the SchuykillBrendan O'Bra Scala By the Schuykill
Brendan O'Bra Scala By the SchuykillBrendan O'Bra
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020confluent
 
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Big Data Spain
 
Apache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once SemanticsApache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once SemanticsYoshiyasu SAEKI
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019Karthik Murugesan
 

Ähnlich wie Introducing Kafka Streams, the new stream processing library of Apache Kafka, Berlin Buzzwords 2016 (13)

kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...
kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...
kash.py - How to Make Your Data Scientists Love Real-time with Ralph M. Debus...
 
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, Bakdata
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, BakdataStream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, Bakdata
Stream Data Deduplication Powered by Kafka Streams | Philipp Schirmer, Bakdata
 
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
Failing to Cross the Streams – Lessons Learned the Hard Way | Philip Schmitt,...
 
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...
Advanced Spark and Tensorflow Meetup - London - Nov 15, 2016 - Deploy Spark M...
 
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
Building Event Streaming Microservices with Spring Boot and Apache Kafka | Ja...
 
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
Kafka Summit SF Apr 26 2016 - Generating Real-time Recommendations with NiFi,...
 
Spark streaming + kafka 0.10
Spark streaming + kafka 0.10Spark streaming + kafka 0.10
Spark streaming + kafka 0.10
 
Brendan O'Bra Scala By the Schuykill
Brendan O'Bra Scala By the SchuykillBrendan O'Bra Scala By the Schuykill
Brendan O'Bra Scala By the Schuykill
 
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
Welcome to Kafka; We’re Glad You’re Here (Dave Klein, Centene) Kafka Summit 2020
 
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
Spark Streaming + Kafka 0.10: an integration story by Joan Viladrosa Riera at...
 
Tutorial Kafka-Storm
Tutorial Kafka-StormTutorial Kafka-Storm
Tutorial Kafka-Storm
 
Apache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once SemanticsApache Kafka 0.11 の Exactly Once Semantics
Apache Kafka 0.11 の Exactly Once Semantics
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019
 

Kürzlich hochgeladen

RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfchwongval
 

Kürzlich hochgeladen (20)

RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdfMultiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
 

Introducing Kafka Streams, the new stream processing library of Apache Kafka, Berlin Buzzwords 2016

  • 1. 1Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Introducing Kafka Streams Michael Noll michael@confluent.io Berlin Buzzwords, June 06, 2016 miguno
  • 2. 2Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 3. 3Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 4. 4Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 5. 5Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 6. 6Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 7. 7Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stream processing in Real Life™ …before Kafka Streams …somewhat exaggerated …but perhaps not that much
  • 8. 8Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016Abandon all hope, ye who enter here.
  • 9. 9Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 How did this… (#machines == 1) scala> val input = (1 to 6).toSeq // Stateless computation scala> val doubled = input.map(_ * 2) Vector(2, 4, 6, 8, 10, 12) // Stateful computation scala> val sumOfOdds = input.filter(_ % 2 != 0).reduceLeft(_ + _) res2: Int = 9
  • 10. 10Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 …turn into stuff like this (#machines > 1)
  • 11. 11Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 12. 12Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 13. 13Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 14. 14Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016Taken at a session at ApacheCon: Big Data, Hungary, September 2015
  • 15. 15Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams stream processing made simple
  • 16. 16Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams • Powerful yet easy-to use Java library • Part of open source Apache Kafka, introduced in v0.10, May 2016 • Source code: https://github.com/apache/kafka/tree/trunk/streams • Build your own stream processing applications that are • highly scalable • fault-tolerant • distributed • stateful • able to handle late-arriving, out-of-order data • <more on this later>
  • 17. 17Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams
  • 18. 18Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams
  • 19. 19Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 What is Kafka Streams: Unix analogy $ cat < in.txt | grep “apache” | tr a-z A-Z > out.txt Kafka Kafka Connect Kafka Streams
  • 20. 20Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 What is Kafka Streams: Java analogy
  • 21. 21Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 When to use Kafka Streams (as of Kafka 0.10) Recommended use cases • Application Development • “Fast Data” apps (small or big data) • Reactive and stateful applications • Linear streams • Event-driven systems • Continuous transformations • Continuous queries • Microservices Questionable use cases • Data Science / Data Engineering • “Heavy lifting” • Data mining • Non-linear, branching streams (graphs) • Machine learning, number crunching • What you’d do in a data warehouse
  • 22. 22Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Alright, can you show me some code now?  KStream<Integer, Integer> input = builder.stream(“numbers-topic”); // Stateless computation KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2); // Stateful computation KTable<Integer, Integer> sumOfOdds = input .filter((k,v) -> v % 2 != 0) .selectKey((k, v) -> 1) .reduceByKey((v1, v2) -> v1 + v2, ”sum-of-odds"); • API option 1: Kafka Streams DSL (declarative)
  • 23. 23Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Alright, can you show me some code now?  Startup Process a record Periodic action Shutdown • API option 2: low-level Processor API (imperative)
  • 24. 24Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 API != everything API, coding Operations, debugging, … “Full stack” evaluation
  • 25. 25Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 API != everything API, coding Operations, debugging, … “Full stack” evaluation
  • 26. 26Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams outsources hard problems to Kafka
  • 27. 27Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 How do I install Kafka Streams? <dependency> <groupId>org.apache.kafka</groupId> <artifactId>kafka-streams</artifactId> <version>0.10.0.0</version> </dependency> • There is and there should be no “install”. • It’s a library. Add it to your app like any other library.
  • 28. 28Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 29. 29Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Do I need to install a CLUSTER to run my apps? • No, you don’t. Kafka Streams allows you to stay lean and lightweight. • Unlearn bad habits: “do cool stuff with data != must have cluster” Ok. Ok. Ok. Ok.
  • 30. 30Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 How do I package and deploy my apps? How do I …?
  • 31. 31Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016
  • 32. 32Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 How do I package and deploy my apps? How do I …? • Whatever works for you. Stick to what you/your company think is the best way. • Why? Because an app that uses Kafka Streams is…a normal Java app. • Your Ops/SRE/InfoSec teams may finally start to love not hate you.
  • 33. 33Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka concepts
  • 34. 34Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka concepts
  • 35. 35Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka concepts
  • 36. 36Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams concepts
  • 37. 37Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stream
  • 38. 38Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Processor topology
  • 39. 39Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stream partitions and stream tasks
  • 40. 40Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simpl http://docs.confluent.io/3.0.0/streams/concepts.html#duality-of-streams-and-tables
  • 41. 41Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL alice 2 bob 10 alice 3 time “Alice clicked 2 times.” “Alice clicked 2 times.”KTable = interprets data as changelog stream ~ is a continuously updated materialized view KStream = interprets data as record stream
  • 42. 42Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL alice 2 bob 10 alice 3 time “Alice clicked 2+3 = 5 times.” “Alice clicked 2 3 times.”KTable = interprets data as changelog stream ~ is a continuously updated materialized view KStream = interprets data as record stream
  • 43. 43Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL alice eggs bob lettuce alice milk alice paris bob zurich alice berlin KStream KTable time “Alice bought eggs.” “Alice is currently in Paris.” User purchase records User profile/location information
  • 44. 44Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL alice eggs bob lettuce alice milk alice paris bob zurich alice berlin KStream KTable User purchase records User profile/location information time “Alice bought eggs and milk.” “Alice is currently in Paris Berlin.” Show me ALL values for a key Show me the LATEST value for a key
  • 45. 45Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL
  • 46. 46Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL • JOIN example: compute user clicks by region via KStream.leftJoin(KTable)
  • 47. 47Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL • JOIN example: compute user clicks by region via KStream.leftJoin(KTable) Even simpler in Scala because, unlike Java, it natively supports tuples:
  • 48. 48Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL • JOIN example: compute user clicks by region via KStream.leftJoin(KTable) alice 13 bob 5Input KStream alice (europe, 13) bob (europe, 5)leftJoin() w/ KTable KStream 13europe 5europemap() KStream KTablereduceByKey(_ + _) 13europe …… …… 18europe …… ……
  • 49. 49Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL
  • 50. 50Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Streams meet Tables – in the Kafka Streams DSL
  • 51. 51Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams key features
  • 52. 52Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too
  • 53. 53Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Native Kafka integration • Reading data from Kafka • Writing data to Kafka
  • 54. 54Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Native Kafka integration • You can configure both Kafka Streams plus the underlying Kafka clients
  • 55. 55Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too • Highly scalable • Fault-tolerant • Elastic
  • 56. 56Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model
  • 57. 57Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model
  • 58. 58Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model
  • 59. 59Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model
  • 60. 60Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams outsources hard problems to Kafka
  • 61. 61Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too • Highly scalable • Fault-tolerant • Elastic • Stateful and stateless computations (e.g. joins, aggregations)
  • 62. 62Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stateful computations • Stateful computations like aggregations or joins require state • We already showed a join example in the previous slides. • Windowing a stream is stateful, too, but let’s ignore this for now. • State stores in Kafka Streams • Typically: key-value stores • Pluggable implementation: RocksDB (default), in-memory, your own … • State stores are per stream task for isolation (think: share-nothing) • State stores are local for best performance • State stores are replicated to Kafka for elasticity and for fault-tolerance
  • 63. 63Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model State stores (This is a bit simplified.)
  • 64. 64Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Remember?
  • 65. 65Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model State stores (This is a bit simplified.) charlie 3 bob 1 alice 1 alice 2
  • 66. 66Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model State stores (This is a bit simplified.)
  • 67. 67Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Execution model State stores (This is a bit simplified.) alice 1 alice 2
  • 68. 68Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Kafka Streams outsources hard problems to Kafka
  • 69. 69Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stateful computations • Kafka Streams DSL: abstracts state stores away from you • Stateful operations include • count(), reduceByKey(), aggregate(), … • Low-level Processor API: direct access to state stores • Very flexible but more manual work for you
  • 70. 70Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Stateful computations • Use the low-level Processor API to interact directly with state stores Get the store Use the store
  • 71. 71Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too • Highly scalable • Fault-tolerant • Elastic • Stateful and stateless computations • Time model
  • 72. 72Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Time
  • 73. 73Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Time
  • 74. 74Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Time • You configure the desired time semantics through timestamp extractors • Default extractor yields event-time semantics • Extracts embedded timestamps of Kafka messages (introduced in v0.10)
  • 75. 75Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too • Highly scalable • Fault-tolerant • Elastic • Stateful and stateless computations • Time model • Windowing
  • 76. 76Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Windowing TimeWindows.of(3000) TimeWindows.of(3000).advanceBy(1000)
  • 77. 77Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Windowing use case: monitoring (1m/5m/15m averages) Confluent Control Center for Kafka
  • 78. 78Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Key features in 0.10 • Native, 100%-compatible Kafka integration • Also inherits Kafka’s security model, e.g. to encrypt data-in-transit • Uses Kafka as its internal messaging layer, too • Highly scalable • Fault-tolerant • Elastic • Stateful and stateless computations • Time model • Windowing • Supports late-arriving and out-of-order data • Millisecond processing latency, no micro-batching • At-least-once processing guarantees (exactly-once is in the works)
  • 79. 79Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Wrapping up
  • 80. 80Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Where to go from here? • Kafka Streams is available in Apache Kafka 0.10 and Confluent Platform 3.0 • http://kafka.apache.org/ • http://www.confluent.io/download (free + enterprise versions, tar/zip/deb/rpm) • Kafka Streams demos at https://github.com/confluentinc/examples • Java 7, Java 8+ with lambdas, and Scala • WordCount, Joins, Avro integration, Top-N computation, Windowing, … • Apache Kafka documentation: http://kafka.apache.org/documentation.html • Confluent documentation: http://docs.confluent.io/3.0.0/streams/ • Quickstart, Concepts, Architecture, Developer Guide, FAQ • Join our bi-weekly Ask Me Anything sessions on Kafka Streams • Contact me at michael@confluent.io for details
  • 81. 81Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Some of the things to come • Exactly-once semantics • Queriable state – tap into the state of your applications • SQL interface • Listen to and collaborate with the developer community • Your feedback counts a lot! Share it via users@kafka.apache.org Tomorrow’s keynote (09:30 AM) by Neha Narkhede, co-founder and CTO of Confluent “Application development and data in the emerging world of stream processing”
  • 82. 82Introducing Kafka Streams, Michael G. Noll, Berlin Buzzwords, June 2016 Want to contribute to Kafka and open source? Join the Kafka community http://kafka.apache.org/ Questions, comments? Tweet with #bbuzz and /cc to @ConfluentInc …in a great team with the creators of Kafka? Confluent is hiring  http://confluent.io/

Hinweis der Redaktion

  1. Let’s say you want to brew yourself a delicious coffee (= something of value to you).
  2. So when you’re at home, what do you need?
  3. First of all, you need water (data) because it is a key ingredient for coffee. So how do you get the water? Right – as a continuous stream through your water pipes (Kafka). Putting pipes in place typically requires some upfront work but it’s a crucial stepping stone that unlocks and enables many things – such as brewing coffee.
  4. But while having convenient access to the water through water pipes is necessary, it is not sufficient. You need something to transform the water into the tasty coffee.
  5. So we need a coffee machine (Kafka Streams). And this coffee machine should (1) plug right into your water pipes and (2) brew wonderful coffee without forcing you to take a 6-month barista class.
  6. One of the things you will realize: As a developer, you’ll be forced to become an Ops person, too, and you’ll need to become that very quickly.
  7. There’s a tremendous discrepancy between how we want to work and how we actually do work.
  8. When you are designing a new tool (like Kafka Streams) or when you are evaluating an existing one, you should ask yourself: "Is this tool part of the solution, or part of the problem?”
  9. What do we gain if we make complex things simple, easy, and fun? One gain is developer efficiency.
  10. Apache Kafka has made the streaming of data (= the transport and sharing of data) simple, easy, and fun. With Kafka Streams we set out to do the same for the *processing* of stream data.
  11. Basic workflow is: Get the input data into Kafka, e.g. via Kafka Connect (part of Apache Kafka) or via your own applications that write data to Kafka. Process the data with Kafka Streams, and write the results back to Kafka.
  12. Basic workflow is: Get the input data into Kafka, e.g. via Kafka Connect (part of Apache Kafka) or via your own applications that write data to Kafka. Process the data with Kafka Streams, and write the results back to Kafka.
  13. FYI: Some people have begun using the low-level Processor API to port their Apache Storm code to Kafka Streams.
  14. If you ARE running at large scale, what is the cost for doing so (notably operations)? If you ARE NOT running at large scale, what is the tool’s minimum investment? Is the tool cost-effective for small to medium scale, too? Same questions if you set out to design and build a tool.
  15. If you ARE running at large scale, what is the cost for doing so (notably operations)? If you ARE NOT running at large scale, what is the tool’s minimum investment? Is the tool cost-effective for small to medium scale, too? Same questions if you set out to design and build a tool.
  16. In some sense, Kafka Streams exploits economies of scale. Projects like Mesos or Kubernetes, for example, are totally focused on resource management and scheduling, and they’ll always do a better job here than a tool that’s focused on stream processing. Kafka Streams should rather allow for “composition” with these deployment tools and resources managers (think: Unix philosophy) rather than being strongly opinionated and dictating any such choices upon you.
  17. Stream: ordered, re-playable, fault-tolerant sequence of immutable data records Data records: key-value pairs (closely related to Kafka’s key-value messages)
  18. Processor topology: computational logic of an app’s data processing Defined via the Kafka Streams DSL or the low-level Processor API Stream processor: a node in the topology, represents a processing step As a user, you will only be exposed to these nuts n’ bolts if you use the Processor API. The Kafka Streams DSL hides this from you.
  19. Stream partitions and stream tasks are the logical units of parallelism Stream partition: totally ordered sequence of data records; maps to a Kafka topic A data record in the stream maps to a Kafka message from that topic The keys of data records determine the partitioning of data in both Kafka and Kafka Streams, i.e. how data is routed to specific partitions A processor topology is scaled by breaking it into multiple stream tasks, based on number of input stream partitions Stream tasks work in isolation, i.e. independent from each other Each Stream task has its own local, fault-tolerant state
  20. KStream ~ records are interpreted as INSERTs (since no record replaces any existing record) KTable – records are interpreted as UPDATEs (since any existing row with the same key is overwritten) Note: We ignore Bob in this diagram. Bob is only shown to highlight that there is generally more data than just Alice’s.
  21. KStream ~ records are interpreted as INSERTs (since no record replaces any existing record) KTable – records are interpreted as UPDATEs (since any existing row with the same key is overwritten) Note: We ignore Bob in this diagram. Bob is only shown to highlight that there is generally more data than just Alice’s.
  22. You can go back and forth between streams and tables. If you need to turn a stream into a table, you must specify the semantics of this transformation, for example when doing aggregations. These semantics can’t be second-guessed by a stream processing tool – for example, do you want to add or to multiply numbers in a stream? Only you know. In contrast, turning a table into a stream is straight-forward – we can simply iterate through every key/row in the table.
  23. To summarize: some operations such as map() retain the shape (e.g. a stream will stay a stream), some operations change the shape (e.g. a stream will become a table).
  24. A notable aspect of KTable is that it is being continuously updated “behind the scenes” as new data arrives to its input topic. This means that the join operations will automatically be using the latest available data for the KTable. Careful: Keep note on how the time flows in this diagram!
  25. A proper notion and model of time is crucial for stream processing