This document summarizes a talk on using Kafka and Kafka Streams to build a functional architecture for exporting car listing data from AutoScout24 to other marketplaces. It covers the business requirements, including letting dealers enable and disable exports and keeping exported listings up to date, then gives an overview of Kafka and Kafka Streams and how they enable a scalable, fault-tolerant streaming data pipeline modeled as a composition of stateless functions. Key learnings: functions should map between domains, and consistency holds on a per-partition basis.
3. WHAT TO EXPECT?
● To meet ScoutWorks :)
● Tales about business requirements
● A brief introduction to some Kafka & Kafka Streams conventions
● See how we designed our architecture
● Talk about resilience in a functional architecture
4. AUTOSCOUT24
● Platform for selling cars & motorbikes
● 8 countries + 10 language versions
● 55,000+ dealers
● 2.4+ million listings
● 3+ billion page impressions per month
● 10+ million active users per month
5. OUR DOMAIN
● The core of our domain is listings
● Images are one of the main sources of information in a listing
● Dealers want to export those listings to other marketplaces
6. OUR PRODUCT
A system able to export dealers’ high-quality listings
to other marketplaces to improve their visibility on the market.
7. BUSINESS REQUIREMENTS
● A dealer is capable of enabling and disabling the export process
● All active listings of a dealer will be exported
● Exported listings that become inactive or are deleted should be hidden
on external marketplaces
8. MORE BUSINESS REQUIREMENTS
● It’s acceptable not to have the latest listing information exported in real time,
but it should eventually be updated
● It’s important to have all listings on external marketplaces ASAP to ensure
visibility
● The listings data format is dynamic, so it should be possible to reprocess a
listing and export it again
9. TECH REQUIREMENTS
● Load fluctuates during the day, so scaling up / down is mandatory
● Easy to add additional marketplaces
● Easy to monitor / trace any listing
12. WHAT IS KAFKA?
● Distributed streaming platform
● Records are published to topics, which are formed by partitions
● Each partition is an append-only, structured commit log
● Records consist of a partition key, a value and a timestamp, plus an assigned
offset, which is the record’s position in the log
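As an illustration, the log model above can be sketched as a minimal in-memory structure (a hypothetical model, not Kafka's actual implementation):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Record:
    key: str        # partition key
    value: str
    timestamp: int
    offset: int     # position of the record in the partition log

@dataclass
class Partition:
    """An append-only commit log: records are only ever appended, never mutated."""
    records: list = field(default_factory=list)

    def append(self, key: str, value: str, timestamp: int) -> int:
        # The offset is simply the next position in the log.
        record = Record(key, value, timestamp, offset=len(self.records))
        self.records.append(record)
        return record.offset

p = Partition()
p.append("listing-42", "created", 1000)   # offset 0
p.append("listing-42", "updated", 1005)   # offset 1
```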
13. KAFKA GUARANTEES
● Sharding of records based on partition key
● Replication of records depending on configuration
● Ordering of records within partition
● At-least-once delivery guarantee of records
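The sharding guarantee can be sketched as follows: records with the same key deterministically land on the same partition, which is exactly what makes per-key ordering possible (a simplified model; Kafka's default partitioner actually uses murmur2 hashing):

```python
import hashlib

NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    """Deterministic hash: the same key always maps to the same partition,
    so all records for one aggregate keep their relative order."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# All records for one listing share a partition, so their order is preserved.
assert partition_for("listing-42") == partition_for("listing-42")
```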
14. WHY KAFKA?
Kafka is often used for building real-time streaming applications
that transform or react to the streams of data.
15. WHY KAFKA?
● Listings change propagation fits the Kafka streaming mindset very well
● Possibility to go back in time and reprocess records if needed
● Enables developers to design by thinking in a composition of small functions
16. KAFKA STREAMS
● Opinionated library to process streams of records
● Provides possibility to build elastic, scalable and fault-tolerant solutions
● Uses Kafka to store current offsets / intermediate state of processed data
● Supports stateless processing, stateful processing or windowing
operations, e.g. aggregates of records
● For stateless operations, lets microservices be seen as state-ignorant
pure functions, letting Kafka Streams take care of side effects
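The "state-ignorant pure function" view can be sketched like this: a stateless processor is just a function from input record to output record, with the streaming engine handling offsets and delivery (the domain fields below are illustrative, not the real AutoScout24 schema):

```python
def export_listing(listing: dict) -> dict:
    """Pure transformation: internal listing -> marketplace export format.
    No I/O, no shared state; trivially testable and safe to re-run."""
    return {
        "external_id": f"as24-{listing['id']}",
        "title": listing["title"],
        "active": listing["status"] == "active",
    }

def process(stream, fn):
    # Conceptually what a stateless Kafka Streams mapValues step does:
    # apply the pure function to every record, one by one.
    return [fn(record) for record in stream]
```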
18. STREAMING VS MESSAGING
● Very similar approaches, but...
● Who has the fish?
● Go back in time and re-process records?
● Ordered records for a single aggregate root
21. FUNCTIONS ARE
● Atomic: functions run once and completely; they cannot be interrupted
● Composable: functions can be chained, generating more abstract
and business-related algebras
● State-ignorant: state is shared as a parameter, avoiding mutable
state between functions
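Composability can be sketched as plain function composition: small functions chain into a single business-level pipeline (the function names are illustrative):

```python
from functools import reduce

def compose(*fns):
    """Chain functions left to right into a single pipeline function."""
    return lambda x: reduce(lambda acc, f: f(acc), fns, x)

# Three small, atomic, state-ignorant steps...
normalize = lambda listing: {**listing, "title": listing["title"].strip()}
enrich    = lambda listing: {**listing, "images": listing.get("images", [])}
to_export = lambda listing: {"id": listing["id"], "title": listing["title"]}

# ...composed into one business-related function.
pipeline = compose(normalize, enrich, to_export)
pipeline({"id": 1, "title": "  BMW 320d "})   # {'id': 1, 'title': 'BMW 320d'}
```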
24. AGGREGATE ROOT
● Is the boundary of consistency
● Is a set of records in a single topic with the same partition key
● Represents a single business object (for example, a Listing)
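Under this model, the current state of an aggregate root is the fold of all records sharing its partition key, replayed in offset order (a simplified sketch with a hypothetical Listing event set):

```python
def current_state(records, key):
    """Rebuild a single business object (e.g. a Listing) by replaying,
    in order, every record in the partition that has the same key."""
    state = {}
    for record_key, event in records:
        if record_key != key:
            continue
        state.update(event)   # last-write-wins per field
    return state

events = [
    ("listing-42", {"title": "BMW 320d", "status": "active"}),
    ("listing-7",  {"title": "Audi A4", "status": "active"}),
    ("listing-42", {"status": "inactive"}),
]
current_state(events, "listing-42")  # {'title': 'BMW 320d', 'status': 'inactive'}
```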
36. KAFKA
For every topic with a replication factor of N,
Kafka tolerates up to N-1 node failures.
37. KAFKA STREAMS
● One-node setup: after coming back, it picks up where processing stopped
● Multi-node setup: other nodes take over, but…
○ Stateless processor: continues working as soon as nodes are re-balanced
○ Stateful processor, simple setup: can take a while until state is rebuilt
○ Stateful processor, hot stand-by setup: local state is continuously built up, but records
are not actually processed until failover happens
38. LEARNINGS
● Function signatures should be unique (only one function should be
responsible for a single transformation)
● Functions, by design, should not pertain to a single domain, but
map between two domains
● The consistency boundary is a partition (or a single aggregate root)
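The "map two domains" learning can be sketched with explicit types: each function's signature names a source domain and a target domain, and exactly one function owns that mapping (the types below are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Listing:          # internal AutoScout24 domain
    id: int
    title: str
    active: bool

@dataclass(frozen=True)
class MarketplaceAd:    # external marketplace domain
    ref: str
    headline: str
    visible: bool

def listing_to_ad(listing: Listing) -> MarketplaceAd:
    """The single function responsible for the Listing -> MarketplaceAd mapping;
    its signature is unique in the codebase."""
    return MarketplaceAd(
        ref=f"as24-{listing.id}",
        headline=listing.title,
        visible=listing.active,
    )
```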
39. LEARNINGS
● A system can be seen as a composition of functions, but data needs
to be managed by an external system.
● As with any function, we should test transformations, not side effects.
● Adding a correlation id on data sources is really useful for tracing, but
boundaries should be chosen carefully.
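Testing the transformation rather than the side effect can be sketched as follows: the test exercises the pure mapping directly, with no broker and no mocks (the function under test is hypothetical):

```python
def hide_inactive(listing: dict) -> dict:
    """Pure mapping: an inactive listing becomes invisible on the marketplace."""
    return {**listing, "visible": listing["status"] == "active"}

def test_inactive_listing_is_hidden():
    # We assert on the transformation's output, not on any I/O.
    result = hide_inactive({"id": 1, "status": "inactive"})
    assert result["visible"] is False

test_inactive_listing_is_hidden()
```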
40. LEARNINGS
● Kafka Streams should not be used for external I/O. For example, if
you need a service that makes HTTP requests, use another streaming
engine for that (we used Akka Streams).
● Kafka Streams’ learning curve is really steep.
● Kafka Streams and Kafka are, by default, not there yet for medium-sized
messages (~50 KB). You will need to tweak and optimize the
configuration.
41. LEARNINGS
● Backpressure is a natural fit, as functions are pull-based.
● Unidirectional data flow is a mindset that needs to be learned and
practiced.
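Pull-based backpressure can be sketched with generators: the consumer asks for the next record only when it is ready, so a slow consumer naturally slows the producer (an illustrative model, not Kafka's polling API):

```python
def records():
    """Lazy producer: yields a record only when the consumer pulls one."""
    for i in range(1_000_000):
        yield {"offset": i}

def consume(stream, batch_size=3):
    # The consumer controls the pace: it pulls exactly what it can handle.
    return [next(stream) for _ in range(batch_size)]

stream = records()
consume(stream)   # first 3 records; the remaining ones are never produced
```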
42. THANK YOU
For questions or suggestions:
Kevin Mas Ruiz (@skmruiz)
kmas@ThoughtWorks.com
Alexey Gravanov (@gravanov)
alexey.gravanov@scout24.com