Using microservices for streaming data
Streaming in Context…
lightbend.com/fast-data-platform
Free as in 🍺
Streaming architectures (from the report)
[Architecture diagram: events arrive via files, sockets, and REST into a Kafka cluster (with a ZooKeeper cluster alongside); mini-batch (Spark Streaming), batch (Spark), and low-latency engines (Flink, Kafka Streams, Akka Streams, Beam) process the streams; persistence goes to S3, HDFS, SQL/NoSQL stores, and search; microservices (Reactive Platform, Go, Node.js, …) run alongside. Everything runs on Kubernetes, Mesos, YARN, etc., in the cloud or on-premise.]
Today’s focus:
• Kafka - the data backplane
• Akka Streams and Kafka Streams - streaming microservices
Why Kafka?
[The architecture diagram again, now highlighting Kafka as the data backplane connecting event sources, streaming engines, and microservices.]
Kafka is organized into topics. Topics are partitioned, replicated, and distributed. Unlike queues, consumers don’t delete entries; Kafka manages their lifecycles. M producers write and N consumers read, each starting wherever it wants.
[Diagram: Topic A with partitions 1 and 2, Topic B with partition 1. For Topic B, partition 1 (offsets 0-15, earliest to latest): Producers 1 and 2 write; Consumer 1 reads at offset 14, Consumer 2 at offset 10, and Consumer 3 at offset 6.]
Logs, not queues!
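To make the log model concrete, here is a minimal consumer sketch using the standard Kafka client from Scala. The broker address, topic name, and group id are illustrative assumptions, not from the talk:

import java.time.Duration
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")   // assumed broker address
props.put("group.id", "offset-demo")               // assumed group id
props.put("key.deserializer",
  "org.apache.kafka.common.serialization.ByteArrayDeserializer")
props.put("value.deserializer",
  "org.apache.kafka.common.serialization.ByteArrayDeserializer")

val consumer  = new KafkaConsumer[Array[Byte], Array[Byte]](props)
val partition = new TopicPartition("topic-b", 0)   // assumed topic
consumer.assign(Collections.singletonList(partition))
consumer.seek(partition, 6L)  // start at offset 6, like Consumer 3 in the diagram;
                              // the records stay in the log for other consumers
val records = consumer.poll(Duration.ofMillis(500))
records.forEach(r => println(s"offset=${r.offset}"))
consumer.close()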
Kafka for Connectivity
Before: N * M point-to-point links.
[Diagram: Service 1, Service 2, Service 3, log and other files, internet services, and other services, with every producer wired directly to every consumer.]
After: N + M links, with everything connected through Kafka.
[Diagram: the same producers and consumers, each now wired only to the Kafka cluster; the point-to-point links are crossed out.]
• Simplify dependencies
• Resilient against data loss
• M producers, N consumers
• Simplicity of one “API” for communication (a minimal producer sketch follows)
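As a sketch of that one “API”, producing to a topic with the standard Kafka client; the broker address and topic name are illustrative assumptions:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")   // assumed broker address
props.put("key.serializer",
  "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer",
  "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
// Every producer uses the same call, regardless of which services consume later:
producer.send(new ProducerRecord("service1-events", "key-1", "some event"))
producer.close()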
Streaming Engines
[The architecture diagram again, highlighting the streaming engines.]
Spark, Flink - services to which you submit work. Large scale, automatic data partitioning.
Beam - similar. Google’s project that has been instrumental in defining streaming semantics.
They do a lot (Spark example)
[Diagram: the Spark driver program talks to a cluster manager, which launches Spark executors on the cluster’s nodes; each executor runs many tasks.]

import org.apache.spark.sql.SparkSession

object MyApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder   // SparkSession’s constructor is private;
      .appName("MyApp")                // use the builder instead
      .getOrCreate()
    …
  }
}
They do a lot (Spark example)
[Diagram: input flows into the cluster’s nodes over time; filter, flatMap, join, and map run across partitions 1-4 on every node, and the operations group into stages (stage 1, stage 2, …) separated by shuffles.]
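As a sketch of what the diagram’s operations look like in code (input and otherPairs are assumed RDDs, purely for illustration): Spark runs each transformation in parallel across the partitions, and the join forces a shuffle into a new stage:

val words    = input.flatMap(line => line.split(" "))  // one line => many words
val nonEmpty = words.filter(word => word.nonEmpty)     // drop empty strings
val counts   = nonEmpty.map(word => (word, 1))         // key-value pairs
val joined   = counts.join(otherPairs)                 // shuffle: starts a new stage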
Streaming Engines
Akka Streams, Kafka Streams - libraries for “data-centric microservices”. Smaller scale, but great flexibility.
Microservice All the Things!!
A Spectrum of Microservices
[Diagram: a spectrum from events to records. At the event-driven end sit browse, account, orders, and shopping-cart services behind a REST API gateway, plus inventory. At the “record-centric” end, storage feeds data into model training, model serving, and other logic.]
Event-driven μ-services:
• Each item has an identity
• Process each item uniquely
• Think sessions and state machines
“Record-centric” μ-services:
• Anonymous items
• Process en masse
• Think SQL queries
Akka emerged from the left-hand side of the spectrum, the world of highly Reactive microservices. Akka Streams pushes to the right, more data-centric.
Kafka Streams emerged from the right-hand side of the spectrum. It pushes to the left, supporting many event-processing scenarios.
Kafka Streams
• Important stream-processing semantics, e.g.,
• Distinguish between event time and processing time
[Diagram: events at server 1 and server 2 over a timeline (minutes 0, 1, 2, 3, …) propagate to an analysis service, which accumulates them by key; the time an event occurred at a server differs from the time the analysis processes it.]
• Windowing support (e.g., group by within a window); a sketch follows the diagram
[Same diagram: accumulate the events that fall within each window, then process them together.]
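A minimal sketch of that windowing API, using the Kafka 2.x Scala API; the topic name is an illustrative assumption, not the talk’s example:

import java.time.Duration
import org.apache.kafka.streams.kstream.TimeWindows
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._

val builder = new StreamsBuilder
builder.stream[String, String]("events")              // assumed topic
  .groupByKey                                         // group records by key
  .windowedBy(TimeWindows.of(Duration.ofMinutes(1)))  // 1-minute windows
  .count()                                            // events per key per window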
See my O’Reilly report for details.
Kafka Streams
• Exactly once, vs.
• At most once - “fire and forget”
• At least once - keep trying until acknowledged (but you have to handle de-duplication)
• Note: it’s really effectively once
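As a sketch, that guarantee is enabled with one property in the streams configuration; the application id and broker address are assumed values:

import java.util.Properties
import org.apache.kafka.streams.StreamsConfig

val streamsConfiguration = new Properties()
streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "model-serving")      // assumed
streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")  // assumed
// Kafka’s transactional, effectively-once guarantee (the default is at-least-once):
streamsConfiguration.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG,
  StreamsConfig.EXACTLY_ONCE)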
Kafka Streams
• KStream: per-record transformations, one-to-one mapping
• KTable: last value per key; useful for holding state values
(A sketch contrasting the two follows the diagram below.)
[Diagram: the Topic B partition log again, with producers writing and consumers at different offsets; KTable A and KTable B are views over such a log, each holding the latest value per key.]
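Here is the promised sketch contrasting the two abstractions, using Kafka’s Scala API; the topic names are illustrative assumptions:

import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._

val builder = new StreamsBuilder

// KStream: every record flows through; the same key can appear many times.
val clicks = builder.stream[String, String]("clicks")          // assumed topic

// KTable: a changelog view that retains only the latest value per key.
val profiles = builder.table[String, String]("user-profiles")  // assumed topic

// Stream-table join: enrich each click with the user’s current profile.
val enriched = clicks.join(profiles)((click, profile) => s"$profile saw $click")
enriched.to("enriched-clicks")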
Kafka Streams
• Read from and write to Kafka topics and memory
• Kafka Connect used for other sources and sinks
• Load balance and scale using topic partitioning
Kafka Streams
• Java API
• Scala API: written by Lightbend
• SQL!!
Kafka Streams Example
[Diagram: raw data and model params topics feed model training and model serving; scored records flow on to other logic, and final records land in storage.]
We’ll use the new Scala Kafka Streams API:
• Developed by Debasish Ghosh, Boris Lublinsky, Sean Glover, et al. from Lightbend
• See also the following, if you know about queryable state:
• https://github.com/lightbend/kafka-streams-query
See github.com/lightbend/kafka-with-akka-streams-kafka-streams-tutorial
[Diagram: model serving reads the raw data and model params topics and writes scored records.]

val builder = new StreamsBuilderS // New Scala wrapper API
val data  = builder.stream[Array[Byte], Array[Byte]](rawDataTopic)
val model = builder.stream[Array[Byte], Array[Byte]](modelTopic)
val modelProcessor = new ModelProcessor
val scorer = new Scorer(modelProcessor) // scorer.score(record) used below

model.mapValues(bytes => Model.parseBytes(bytes))    // byte array => model record
  .filter((key, model) => model.valid)               // parsed successfully?
  .mapValues(model => ModelImpl.findModel(model))
  .process(() => modelProcessor, …)                  // set up the actual model

data.mapValues(bytes => DataRecord.parseBytes(bytes))
  .filter((key, record) => record.valid)
  .mapValues(record => new ScoredRecord(scorer.score(record), record))
  .to(scoredRecordsTopic)

val streams = new KafkaStreams(builder.build, streamsConfiguration)
streams.start()
sys.addShutdownHook(streams.close())
What’s Missing?
Kafka Streams is a powerful library, but you’ll need to provide the rest of the microservice support through other means. (Some of it is provided if you run the support services for the SQL interface.)
You would embed your KS code in microservices written with more comprehensive toolkits, such as the Lightbend Reactive Platform!
We’ll return to this point…
Akka Streams
• Implements Reactive Streams
• http://www.reactive-streams.org/
• Back pressure for flow control
Akka Streams
[Diagram: an event/data stream feeds a consumer, which feeds another consumer; each link is a bounded queue, and back pressure propagates upstream hop by hop, so producers slow to the pace of the slowest stage.]
Back pressure composes!
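A minimal runnable sketch of that behavior (Akka 2.5-era API; the rates are illustrative). The fast source never floods memory, because demand flows upstream from the slow stage:

import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ThrottleMode}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.duration._

implicit val system = ActorSystem("backpressure-demo")
implicit val materializer = ActorMaterializer()

Source(1 to 1000000)                                // a very fast producer
  .map { i => println(s"emitted  $i"); i }
  .throttle(1, 100.millis, 1, ThrottleMode.Shaping) // a slow downstream stage
  .runWith(Sink.foreach(i => println(s"consumed $i")))
// “emitted” stays only a small, bounded number of elements ahead of “consumed”.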
Akka Streams
• Part of the Akka ecosystem: Akka Actors, Akka Cluster, Akka HTTP, Akka Persistence, …
• Alpakka - rich connector library; like Camel, but implements Reactive Streams
• Very low overhead and latency
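For a taste of the connector style, here is Akka Streams’ built-in file source (the path is an assumption); Alpakka’s connectors for Kafka, S3, JMS, and many others follow the same Source/Flow/Sink shapes:

import java.nio.file.Paths
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{FileIO, Framing, Sink}
import akka.util.ByteString

implicit val system = ActorSystem("connector-demo")
implicit val materializer = ActorMaterializer()

FileIO.fromPath(Paths.get("/var/log/app.log"))   // assumed path
  .via(Framing.delimiter(ByteString("\n"), maximumFrameLength = 4096))
  .map(_.utf8String)                             // byte chunks => text lines
  .runWith(Sink.foreach(println))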
Akka Streams
• The “gist”:

val source: Source[Int, NotUsed] = Source(1 to 10)
val factorials = source.scan(BigInt(1))((total, next) => total * next)
factorials.runWith(Sink.foreach(println))
[Diagram: Source → Flow → Sink; together they form a “graph”.]
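A sketch making all three pieces explicit (reusing the implicit system and materializer from the sketches above):

import akka.NotUsed
import akka.stream.scaladsl.{Flow, Sink, Source}

val source = Source(1 to 10)                                // produces elements
val double: Flow[Int, Int, NotUsed] = Flow[Int].map(_ * 2)  // transforms them
val sink = Sink.foreach[Int](println)                       // consumes them

source.via(double).to(sink).run()  // Source → Flow → Sink: a runnable “graph”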
Akka Streams Example
[Diagram: inside an Akka cluster, model training and model serving consume the raw data and model params topics through Alpakka; scored records flow to other logic, and final records land in storage.]
implicit val system = ActorSystem("ModelServing")
implicit val materializer = ActorMaterializer()
implicit val executionContext = system.dispatcher

val modelProcessor = new ModelProcessor               // same as the KS example
val scorer = new Scorer(modelProcessor)               // same as the KS example
val modelScoringStage = new ModelScoringStage(scorer) // AS custom “stage”

val dataStream: Source[Record, Consumer.Control] =
  Consumer.atMostOnceSource(dataConsumerSettings,
      Subscriptions.topics(rawDataTopic))
    .map(input => DataRecord.parseBytes(input.value()))
    .collect { case Success(data) => data }

val modelStream: Source[ModelImpl, Consumer.Control] =
  Consumer.atMostOnceSource(modelConsumerSettings,
      Subscriptions.topics(modelTopic))
    .map(input => Model.parseBytes(input.value()))
    .collect { case Success(mod) => mod }
    .map(model => ModelImpl.findModel(model))
case class ModelScoringStage(scorer: …) extends
    GraphStageWithMaterializedValue[…, …] {

  val dataRecordIn     = Inlet[Record]("dataRecordIn")
  val modelRecordIn    = Inlet[ModelImpl]("modelRecordIn")
  val scoringResultOut = Outlet[ScoredRecord]("scoringOut")
  …
  setHandler(dataRecordIn, new InHandler {
    override def onPush(): Unit = {
      val record = grab(dataRecordIn)
      val newRecord = new ScoredRecord(scorer.score(record), record)
      push(scoringResultOut, Some(newRecord))
      pull(dataRecordIn)
    }
  })
  …
}
The model stream is then completed with the steps that install each new model, and both streams are run:

val modelStream =
  Consumer.atMostOnceSource(modelConsumerSettings,
      Subscriptions.topics(modelTopic))
    .map(input => Model.parseBytes(input.value()))
    .collect { case Success(mod) => mod }
    .map(model => ModelImpl.findModel(model))
    .collect { case Success(modImpl) => modImpl }
    .map(modImpl => modelProcessor.setModel(modImpl)) // side effect: install the model

modelStream.to(Sink.ignore).run() // No “sinking” required; just run

dataStream
  .viaMat(modelScoringStage)(Keep.right)
  .map(result => new ProducerRecord[Array[Byte], ScoredRecord](
    scoredRecordsTopic, result))
  .runWith(Producer.plainSink(producerSettings))
Other Concerns
• Scale scoring with workers and routers, across a cluster (a sketch follows)
• Persist actor state with Akka Persistence
• Connect to almost anything with Alpakka
• StreamRefs for distributed streams (new!)
[Diagram: within an Akka cluster, a router fans scoring work out to workers; stateful logic persists actor state to storage via Akka Persistence; Alpakka connects model serving and other logic to the outside world.]
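A minimal sketch of the router-and-workers pattern in the diagram; the worker actor and pool size are illustrative assumptions:

import akka.actor.{Actor, ActorSystem, Props}
import akka.routing.RoundRobinPool

class ScoringWorker extends Actor {
  def receive = {
    case record => println(s"scoring $record") // stand-in for scorer.score(record)
  }
}

val system = ActorSystem("scoring")
val router = system.actorOf(
  RoundRobinPool(5).props(Props[ScoringWorker]), "scorers")
router ! "record-1"  // the pool load-balances messages across five workers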
Go Direct or Through Kafka?
[Diagram: model serving connected to other logic directly within an Akka cluster, vs. connected through a Kafka scored-records topic via Alpakka.]
Direct:
• Extremely low latency
• Minimal I/O and memory overhead; no marshaling overhead
• Reactive Streams back pressure
• “Direct” coupling between sender and receiver, but actually uses a URL abstraction
• Hard to scale and evolve independently
Through Kafka:
• Higher latency (network, queue depth)
• Higher I/O and processing (marshaling) overhead
• Very deep buffer (partition limited by disk size)
• Strong decoupling - M producers, N consumers, completely disconnected
• Easy independent scalability and evolution
Wrapping Up…
Free as in 🍺
lightbend.com/fast-data-platform
dean.wampler@lightbend.com
polyglotprogramming.com/talks
Questions?
tonesoftglanshi9
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastPapp Krisztián
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfonteinmasabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...Jittipong Loespradit
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2
 

Kürzlich hochgeladen (20)

Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
WSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaSWSO2CON 2024 Slides - Open Source to SaaS
WSO2CON 2024 Slides - Open Source to SaaS
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 

Akka Streams And Kafka Streams: Where Microservices Meet Fast Data

  • 18. They do a lot (Spark example): [diagram] a logical dataflow of input → filter → flatMap → join → map, automatically split across partitions on the cluster's nodes and grouped into stages over time; narrow transformations share a stage, while shuffles (like the join) start a new one. A sketch of such a pipeline follows.
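To make the picture concrete, here is a minimal sketch (not from the deck) of that kind of logical pipeline in the Spark RDD API; the input path, reference data, and output path are all hypothetical. Spark fuses the narrow transformations into one stage and begins a new stage at the join's shuffle boundary, running each stage in parallel across partitions:

    import org.apache.spark.sql.SparkSession

    object PipelineSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder
          .appName("PipelineSketch")
          .master("local[*]")                              // local run; spark-submit sets this in a cluster
          .getOrCreate()
        val sc = spark.sparkContext

        val events = sc.textFile("events.txt")             // input, partitioned by Spark
        val lookup = sc.parallelize(Seq("a" -> 1, "b" -> 2)) // small reference data set

        events
          .filter(line => line.nonEmpty)                   // stage 1: narrow transformations
          .flatMap(line => line.split("\\s+"))
          .map(word => (word, 1))
          .join(lookup)                                    // stage 2: shuffle boundary
          .map { case (word, (count, id)) => s"$word,$count,$id" }
          .saveAsTextFile("scored")

        spark.stop()
      }
    }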
  • 19. Streaming Engines: Akka Streams and Kafka Streams - libraries for “data-centric microservices”. Smaller scale than engines like Spark and Flink, but great flexibility.
  • 21. A Spectrum of Microservices: [diagram] event-driven μ-services at one end (Browse, Orders, Shopping Cart, Account, Inventory behind a REST API gateway), “record-centric” μ-services at the other (data ingestion into storage, Model Training, Model Serving, Other Logic); Events on the left, Records on the right.
  • 22. A Spectrum of Microservices, the event-driven end: • Each item has an identity • Process each item uniquely • Think sessions and state machines (see the sketch below).
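A minimal sketch (not from the deck) of this style: one classic Akka actor per shopping-cart session, acting as a small state machine. The message types and names are hypothetical.

    import akka.actor.{Actor, ActorSystem, Props}

    case class AddItem(sku: String)
    case object Checkout

    class CartSession extends Actor {
      private var items = List.empty[String]   // per-session state; one actor per cart

      def receive: Receive = {
        case AddItem(sku) => items = sku :: items
        case Checkout     => println(s"checking out: $items"); context.stop(self)
      }
    }

    object CartApp extends App {
      val system = ActorSystem("Carts")
      val cart = system.actorOf(Props[CartSession], "cart-42")  // identity per item
      cart ! AddItem("sku-1")
      cart ! Checkout
    }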
  • 24. A Spectrum of Microservices, the record-centric end: • Anonymous items • Process en masse • Think SQL queries.
  • 26. Akka emerged from the left-hand side of the spectrum, the world of highly Reactive, event-driven microservices. Akka Streams pushes to the right, more data-centric.
  • 27. Kafka Streams emerged from the right-hand, record-centric side, and pushes to the left, supporting many event-processing scenarios.
  • 29. Kafka Streams • Important stream-processing semantics, e.g., • Distinguishing between event time and processing time: [diagram] events occur at Server 1 and Server 2, accumulate there, and only minutes later are propagated to the Analysis service, which collects the data and then processes it - so when an event happened is not when it gets processed.
  • 30. Kafka Streams • Important stream-processing semantics, e.g., • Windowing support (e.g., group by within a window): [same diagram] collect data over a time window, then process it. See my O’Reilly report for details. A windowing sketch follows.
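A minimal sketch (not from the deck) of windowed grouping with the Kafka Streams DSL of this era: count events per server key within one-minute windows. The topic name is hypothetical, and default byte/string serdes are assumed to be configured.

    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.TimeWindows

    val builder = new StreamsBuilder
    builder.stream[String, String]("server-events")   // key = server id (hypothetical topic)
      .groupByKey()
      .windowedBy(TimeWindows.of(60 * 1000L))         // one-minute windows (millis)
      .count()                                        // KTable[Windowed[String], Long]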
  • 31. Kafka Streams • Exactly once • vs. • At most once - “fire and forget” • At least once - keep trying until acknowledged (but you have to handle de-duplication) • Note: it’s really effectively once. Enabling it is just configuration, as sketched below.
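A minimal sketch (not from the deck): the guarantee is selected through Kafka Streams' PROCESSING_GUARANTEE_CONFIG setting; the application id and broker address shown here are hypothetical.

    import java.util.Properties
    import org.apache.kafka.streams.StreamsConfig

    val props = new Properties()
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "model-serving")      // hypothetical id
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")  // hypothetical broker
    props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE)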
  • 32. Kafka Streams • KStream - per-record transformations, a one-to-one mapping • KTable - last value per key; useful for holding state values. A sketch contrasting the two follows.
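A minimal sketch (not from the deck) contrasting the two, using the Java DSL from Scala: the KStream sees every click record, while the KTable always holds the latest profile per key. Topic names are hypothetical and default string serdes are assumed.

    import org.apache.kafka.streams.StreamsBuilder
    import org.apache.kafka.streams.kstream.ValueJoiner

    val builder  = new StreamsBuilder
    val clicks   = builder.stream[String, String]("clicks")    // KStream: every record
    val profiles = builder.table[String, String]("profiles")   // KTable: latest value per key

    // Enrich each click with the user's current profile value.
    val joiner = new ValueJoiner[String, String, String] {
      def apply(click: String, profile: String): String = s"$click/$profile"
    }
    clicks.join(profiles, joiner).to("enriched-clicks")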
  • 33. Kafka Streams • Read from and write to Kafka topics and in-memory stores • Kafka Connect is used for other sources and sinks • Load balance and scale using topic partitioning
  • 34. Kafka Streams • Java API • Scala API: written by Lightbend • SQL!!
  • 36. Kafka Streams Example We’ll use the new Scala Kafka Streams API: • Developed by Debasish Ghosh, Boris Lublinsky, Sean Glover, et al. from Lightbend • See also the following, if you use queryable state: • https://github.com/lightbend/kafka-streams-query
  • 38. The example: Data → Model Training and Model Serving, connected by the Raw Data, Model Params, and Scored Records topics.

    val builder = new StreamsBuilderS // New Scala wrapper API.
    val data  = builder.stream[Array[Byte], Array[Byte]](rawDataTopic)
    val model = builder.stream[Array[Byte], Array[Byte]](modelTopic)

    val modelProcessor = new ModelProcessor
    val scorer = new Scorer(modelProcessor)            // scorer.score(record) used below

    model.mapValues(bytes => Model.parseBytes(bytes))  // bytes => model record
      .filter((key, model) => model.valid)             // successful parse?
      .mapValues(model => ModelImpl.findModel(model))
      .process(() => modelProcessor, …)                // set up the actual model

    data.mapValues(bytes => DataRecord.parseBytes(bytes))
      .filter((key, record) => record.valid)
      .mapValues(record => new ScoredRecord(scorer.score(record), record))
      .to(scoredRecordsTopic)

    val streams = new KafkaStreams(builder.build, streamsConfiguration)
    streams.start()
    sys.addShutdownHook(streams.close())
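The example references a streamsConfiguration that the slides never show. A minimal sketch (not from the deck) of what it might contain, with hypothetical values:

    import java.util.Properties
    import org.apache.kafka.common.serialization.Serdes
    import org.apache.kafka.streams.StreamsConfig

    val streamsConfiguration = new Properties()
    streamsConfiguration.put(StreamsConfig.APPLICATION_ID_CONFIG, "model-scoring")
    streamsConfiguration.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
    streamsConfiguration.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
      Serdes.ByteArray().getClass)   // topics above carry raw bytes
    streamsConfiguration.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
      Serdes.ByteArray().getClass)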
  • 48. What’s Missing? Kafka Streams is a powerful library, but you’ll need to provide the rest of the microservice support tooling through other means. (Some of it is provided if you run the support services for the SQL interface.) You would embed your Kafka Streams code in microservices written with more comprehensive toolkits, such as the Lightbend Reactive Platform! We’ll return to this point…
  • 50. Akka Streams • Implements Reactive Streams • http://www.reactive-streams.org/ • Back pressure for flow control
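A minimal sketch (not from the deck) of back pressure in action: the source could emit as fast as the JVM allows, but demand flows upstream from the slow asynchronous stage, so emission is throttled automatically rather than buffering without bound.

    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Sink, Source}
    import scala.concurrent.Future

    object BackPressureDemo extends App {
      implicit val system = ActorSystem("BackPressure")
      implicit val materializer = ActorMaterializer()
      import system.dispatcher

      Source(1 to 1000000)                       // could emit millions per second...
        .mapAsync(parallelism = 1) { n =>        // ...but demand flows upstream,
          Future { Thread.sleep(10); n }         // so emission slows to the sink's pace
        }
        .runWith(Sink.foreach(println))
    }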
  • 53. Akka Streams • Part of the Akka ecosystem • Akka Actors, Akka Cluster, Akka HTTP, Akka Persistence, … • Alpakka - rich connection library • like Camel, but implements Reactive Streams • Very low overhead and latency
  • 54. Akka Streams • The “gist” - a “Graph” of Source → Flow → Sink:

    val source: Source[Int, NotUsed] = Source(1 to 10)
    val factorials = source.scan(BigInt(1))((total, next) => total * next)
    factorials.runWith(Sink.foreach(println))
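For reference, a self-contained, runnable version of the gist (a sketch assuming Akka 2.5-era APIs, where a materializer is created explicitly):

    import akka.NotUsed
    import akka.actor.ActorSystem
    import akka.stream.ActorMaterializer
    import akka.stream.scaladsl.{Sink, Source}

    object Factorials extends App {
      implicit val system = ActorSystem("Factorials")
      implicit val materializer = ActorMaterializer()

      val source: Source[Int, NotUsed] = Source(1 to 10)
      val factorials = source.scan(BigInt(1))((total, next) => total * next)

      factorials.runWith(Sink.foreach(println)) // prints 1, 1, 2, 6, 24, ...
    }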
  • 56. Akka Streams Example: [diagram] an Akka Cluster running Data ingestion, Model Training, Model Serving, and Other Logic microservices, connected through the Raw Data, Model Params, Scored Records, and Final Records topics via Alpakka, with storage downstream.
  • 57. Model serving with Akka Streams, reading Raw Data and Model Params from Kafka via Alpakka:

    implicit val system = ActorSystem("ModelServing")
    implicit val materializer = ActorMaterializer()
    implicit val executionContext = system.dispatcher

    val modelProcessor = new ModelProcessor                // Same as the KS example
    val scorer = new Scorer(modelProcessor)                // Same as the KS example
    val modelScoringStage = new ModelScoringStage(scorer)  // AS custom “stage”

    val dataStream: Source[Record, Consumer.Control] =
      Consumer.atMostOnceSource(dataConsumerSettings, Subscriptions.topics(rawDataTopic))
        .map(input => DataRecord.parseBytes(input.value()))
        .collect { case Success(data) => data }

    val modelStream: Source[ModelImpl, Consumer.Control] =
      Consumer.atMostOnceSource(modelConsumerSettings, Subscriptions.topics(modelTopic))
        .map(input => Model.parseBytes(input.value()))
        .collect { case Success(mod) => mod }
        .map(model => ModelImpl.findModel(model))
  • 63. The custom scoring “stage” used above (elisions are from the original slide):

    case class ModelScoringStage(scorer: …)
        extends GraphStageWithMaterializedValue[…, …] {

      val dataRecordIn     = Inlet[Record]("dataRecordIn")
      val modelRecordIn    = Inlet[ModelImpl]("modelRecordIn")
      val scoringResultOut = Outlet[ScoredRecord]("scoringOut")
      …
      setHandler(dataRecordIn, new InHandler {
        override def onPush(): Unit = {
          val record = grab(dataRecordIn)
          val newRecord = new ScoredRecord(scorer.score(record), record)
          push(scoringResultOut, newRecord)   // push the record itself to the outlet
          pull(dataRecordIn)
        }
      })
      …
    }
  • 65. Completing the example: install each new model as a side effect of the model stream, and run the data stream through the scoring stage into Kafka. (The original slide used foreach, which Akka Streams sources don’t provide; map plus Sink.ignore achieves the same effect.)

    modelStream
      .map(modImpl => modelProcessor.setModel(modImpl)) // side effect: install the new model
      .to(Sink.ignore).run()                            // no “sinking” required; just run

    dataStream
      .viaMat(modelScoringStage)(Keep.right)
      .map(result => new ProducerRecord[Array[Byte], ScoredRecord](
        scoredRecordsTopic, result))
      .runWith(Producer.plainSink(producerSettings))
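The example assumes dataConsumerSettings, modelConsumerSettings, and producerSettings, which the slides never show. A minimal sketch (not from the deck) using Alpakka Kafka's ConsumerSettings and ProducerSettings; the broker address and group id are hypothetical, modelConsumerSettings would be analogous with its own group id, and the real code would also need a serializer for ScoredRecord rather than raw bytes:

    import akka.kafka.{ConsumerSettings, ProducerSettings}
    import org.apache.kafka.common.serialization.{ByteArrayDeserializer, ByteArraySerializer}

    val dataConsumerSettings =
      ConsumerSettings(system, new ByteArrayDeserializer, new ByteArrayDeserializer)
        .withBootstrapServers("localhost:9092")   // hypothetical broker
        .withGroupId("model-serving")             // hypothetical group id

    val producerSettings =
      ProducerSettings(system, new ByteArraySerializer, new ByteArraySerializer)
        .withBootstrapServers("localhost:9092")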
  • 68. Beyond a single stream: • Scale scoring with workers and routers, across a cluster • Persist actor state with Akka Persistence • Connect to almost anything with Alpakka • StreamRefs for distributed streams (new!). [diagram] a router fanning work out to worker actors, with stateful logic persisted to storage. One simple in-process scaling sketch follows.
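A minimal sketch (not from the deck, and not the router/worker approach in the diagram) of the simplest way to parallelize scoring inside one service, reusing names from the example above: mapAsync fans records out to parallel futures while preserving record order and back pressure.

    import scala.concurrent.Future
    // assumes the implicit dispatcher, dataStream, scorer, and settings defined earlier

    dataStream
      .mapAsync(parallelism = 8) { record =>                 // up to 8 concurrent scorings
        Future(new ScoredRecord(scorer.score(record), record))
      }
      .map(result => new ProducerRecord[Array[Byte], ScoredRecord](
        scoredRecordsTopic, result))
      .runWith(Producer.plainSink(producerSettings))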
  • 69. Go Direct or Through Kafka? Direct connection: • Extremely low latency • Minimal I/O and memory overhead; no marshaling overhead • Hard to scale and evolve independently. Through Kafka: • Higher latency (network, queue depth) • Higher I/O and processing (marshaling) overhead • Easy independent scalability and evolution.
  • 70. Go Direct or Through Kafka? (continued) Direct connection: • Reactive Streams back pressure • “Direct” coupling between sender and receiver, though StreamRefs actually use a URL abstraction. Through Kafka: • Very deep buffer (a partition is limited only by disk size) • Strong decoupling - M producers, N consumers, completely disconnected.