KAFKA & KAFKA STREAMS
A FUNCTIONAL ARCHITECTURE
KEVIN MAS RUIZ & ALEXEY GRAVANOV
WHO ARE WE?
Kevin Mas Ruiz
Thoughtworker
Alexey Gravanov
AutoScoutie
WHAT TO EXPECT?
● To meet ScoutWorks :)
● Tales about business requirements
● A brief introduction to some Kafka & Kafka Streams conventions
● See how we designed our architecture
● Talk about resilience in a functional architecture
AUTOSCOUT24
● Platform for selling cars & motorbikes
● 8 countries + 10 language versions
● 55,000+ dealers
● 2.4+ million listings
● 3+ billion page impressions per month
● 10+ million active users per month
OUR DOMAIN
● Listings are the core of our domain
● Images are one of the main sources of information in a listing
● Dealers want to export those listings to other marketplaces
OUR PRODUCT
A system able to export dealers’ high-quality listings
to other marketplaces to improve their visibility on the market.
BUSINESS REQUIREMENTS
● A dealer can enable and disable the export process
● All active listings of a dealer will be exported
● Exported listings that become inactive or deleted should be hidden
on external marketplaces
MORE BUSINESS REQUIREMENTS
● It’s acceptable not to have the latest listing information exported in real time,
but it should eventually be updated
● It’s important to have all listings on external marketplaces ASAP to ensure
visibility
● The listing data format is dynamic, so it should be possible to reprocess a
listing and export it again
TECH REQUIREMENTS
● Load fluctuates during the day, so scaling up / down is mandatory
● Easy to add additional marketplaces
● Easy to monitor / trace any listing
DATA FLOW
KAFKA
WHAT IS KAFKA?
● Distributed streaming platform
● Records are published to topics, which are made up of partitions
● Each partition is an append-only (*) structured commit log
● Records consist of a partition key, a value, a timestamp and an assigned
offset, which is the position of the record in the log
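A minimal sketch of the record and partition structure described above, assuming a purely in-memory model. PartitionLog, Record and the sample listing keys are hypothetical names for illustration, not part of any Kafka client API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Record:
    key: str        # partition key
    value: str
    timestamp: int
    offset: int     # position of the record in the partition log


@dataclass
class PartitionLog:
    """Hypothetical in-memory stand-in for one partition (not a Kafka API)."""
    records: List[Record] = field(default_factory=list)

    def append(self, key: str, value: str, timestamp: int) -> Record:
        # The offset is assigned by the log on append, not by the producer.
        record = Record(key, value, timestamp, offset=len(self.records))
        self.records.append(record)
        return record

    def read_from(self, offset: int) -> List[Record]:
        # Reading never removes records, so any offset can be read again.
        return self.records[offset:]


log = PartitionLog()
log.append("listing-1", "created", timestamp=1)
log.append("listing-1", "price-changed", timestamp=2)
log.append("listing-2", "created", timestamp=3)

offsets = [r.offset for r in log.read_from(0)]
replayed = [r.value for r in log.read_from(1)]
```

Because the log is only ever appended to, offsets are stable and a reader can start again from any earlier position.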
KAFKA GUARANTEES
● Sharding of records based on partition key
● Replication of records depending on configuration
● Ordering of records within partition
● At-least-once delivery guarantee of records
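The sharding and ordering guarantees can be illustrated with a small sketch. It assumes a fixed, hypothetical partition count and uses CRC32 as a stand-in hash so the result is deterministic; the Java client's default partitioner uses a different hash (murmur2) for keyed records, so the actual partition numbers would differ.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash for illustration only; not the hash Kafka itself uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("listing-1", "created"),
    ("listing-2", "created"),
    ("listing-1", "updated"),
    ("listing-1", "deleted"),
]

# Each partition is modelled as an append-only list.
partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# All records with the same key land in the same partition, in publish order.
listing_1_partition = partition_for("listing-1")
listing_1_events = [v for k, v in partitions[listing_1_partition] if k == "listing-1"]
```

Ordering is only meaningful per partition: nothing here says whether the listing-2 record is read before or after the listing-1 records if they end up in different partitions.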
WHY KAFKA?
Kafka is often used for building real-time streaming applications
that transform or react to the streams of data.
WHY KAFKA?
● Propagating listing changes fits the Kafka streaming mindset very well
● Possibility to go back in time and reprocess records if needed
● Enables developers to design by thinking in compositions of small functions
KAFKA STREAMS
● Opinionated library to process streams of records
● Makes it possible to build elastic, scalable and fault-tolerant solutions
● Uses Kafka to store current offsets / intermediate state of processed data
● Supports stateless processing, stateful processing and windowing
operations, e.g. aggregates of records
● For stateless operations, lets us see microservices as state-ignorant
pure functions, leaving Kafka Streams to take care of side effects
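The idea of a stateless service as a state-ignorant pure function can be sketched without any streaming library. The function names, field names and sample listing below are assumptions made for illustration; in a real topology the reading and writing of records would be left to Kafka Streams.

```python
from functools import reduce
from typing import Callable, Dict

Listing = Dict[str, object]


def normalize_price(listing: Listing) -> Listing:
    # Returns a new dict instead of mutating the input.
    return {**listing, "price": round(float(listing["price"]), 2)}


def to_marketplace_format(listing: Listing) -> Listing:
    # Hypothetical target format for an external marketplace.
    return {
        "id": listing["id"],
        "title": str(listing["title"]).strip(),
        "price_eur": listing["price"],
    }


def compose(*steps: Callable[[Listing], Listing]) -> Callable[[Listing], Listing]:
    # Left-to-right composition: the output of one step feeds the next.
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


export_listing = compose(normalize_price, to_marketplace_format)

source = {"id": "listing-1", "title": "  VW Golf ", "price": "14999.999"}
exported = export_listing(source)
```

Each step depends only on its argument, so the composed function can be run again on the same record with the same result.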
KAFKA STREAMS GUARANTEES
STREAMING VS MESSAGING
● Very similar approaches, but...
● Who has the fish?
● Go back in time and re-process records?
● Ordered records for a single aggregate root
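The "go back in time" difference can be shown with a deliberately simplified model: a queue where consuming removes the message, and a log where the consumer only moves its own offset. Real message brokers vary in their semantics, so this is an illustration of the idea rather than a description of any specific product.

```python
from collections import deque

events = ["created", "price-changed", "deactivated"]

# Messaging-style queue: consuming a message removes it.
queue = deque(events)
first_pass_queue = [queue.popleft() for _ in range(len(queue))]
second_pass_queue = list(queue)  # nothing is left to re-process

# Streaming-style log: records stay, the consumer tracks an offset.
log = list(events)
offset = 0
first_pass_log = log[offset:]
offset = len(log)   # consumer has caught up
offset = 0          # rewind to re-process from the beginning
second_pass_log = log[offset:]
```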
MODELING WITH FUNCTIONS
FUNCTIONS ARE
● Atomic: functions run once and completely, and cannot be interrupted
● Composable: functions can be chained, generating more abstract and
business-related algebras
● State-ignorant: state is shared as a parameter, avoiding mutable
state between functions
CONSISTENCY BOUNDARIES
● Can only be ensured on a single partition
● Is degraded when repartitioning
AGGREGATE ROOT
● Is the boundary of consistency
● Is a set of records in a single topic with the same partition key
● Represents a single business object (for example, a Listing)
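As a sketch of this definition, the current state of one business object can be rebuilt by folding, in order, over the records that share its partition key. The event types and fields below are hypothetical and not taken from the talk.

```python
from functools import reduce


def apply_event(state: dict, event: dict) -> dict:
    # Pure function: builds a new state from the old state and one event.
    if event["type"] == "updated":
        return {**state, **event["data"]}
    if event["type"] in ("deactivated", "deleted"):
        return {**state, "visible": False}
    return state


# Records of one topic as (partition key, event) pairs, in log order.
records = [
    ("listing-1", {"type": "updated", "data": {"price": 100, "visible": True}}),
    ("listing-2", {"type": "updated", "data": {"price": 500, "visible": True}}),
    ("listing-1", {"type": "updated", "data": {"price": 90}}),
    ("listing-1", {"type": "deactivated"}),
]


def aggregate(key: str, all_records) -> dict:
    # Only records with the same key belong to the aggregate root.
    own_events = (event for k, event in all_records if k == key)
    return reduce(apply_event, own_events, {})


listing_1 = aggregate("listing-1", records)
listing_2 = aggregate("listing-2", records)
```

The fold relies on the per-partition ordering: applying the same events in a different order could produce a different state.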
TOPOLOGY
Functions are based on an iterative
business language, not on size
WHAT ABOUT FAULT-TOLERANCE?
"Everything fails all the time."
Werner Vogels
VP & CTO at Amazon.com
KAFKA
For every topic with a replication factor of N,
Kafka tolerates the failure of up to N-1 nodes.
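The rule above is simple arithmetic, sketched here with hypothetical helper names. It only covers whether a copy of the data survives; depending on producer and broker settings (for example acks and min.insync.replicas), writes may be rejected before that limit is reached.

```python
def tolerated_failures(replication_factor: int) -> int:
    # With N replicas, one surviving replica is enough to keep the data.
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor - 1


def data_survives(replication_factor: int, failed_nodes: int) -> bool:
    return failed_nodes <= tolerated_failures(replication_factor)


three_replicas_two_down = data_survives(3, failed_nodes=2)
three_replicas_all_down = data_survives(3, failed_nodes=3)
```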
KAFKA STREAMS
● Single-node setup: after coming back, it picks up where processing stopped
● Multi-node setup: other nodes take over, but…
○ Stateless processor: continues working as soon as nodes are re-balanced
○ Stateful processor, simple setup: it can take a while until state is built up
○ Stateful processor, hot stand-by setup: local state is kept built up, but records are
not actually processed until failover happens
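"Picking up where processing stopped" can be sketched as a loop that resumes from the last committed offset. The in-memory offset store, the commit interval of two records and the simulated crash are all assumptions for illustration; Kafka Streams keeps its offsets in Kafka itself.

```python
log = ["r0", "r1", "r2", "r3", "r4"]
committed = {"offset": 0}  # hypothetical durable offset store
processed = []             # every record handed to the processing function


def run(crash_after=None):
    """Process from the last committed offset; optionally crash mid-way."""
    offset = committed["offset"]
    handled = 0
    while offset < len(log):
        if handled == crash_after:
            return  # simulated crash: uncommitted progress is lost
        processed.append(log[offset])
        offset += 1
        handled += 1
        if handled % 2 == 0:
            committed["offset"] = offset  # periodic commit
    committed["offset"] = offset  # final commit once the log is drained


run(crash_after=3)          # crashes after r0, r1, r2; only offset 2 is committed
offset_after_crash = committed["offset"]
run()                       # restart: resumes at r2, so r2 is processed twice
```

The duplicate r2 is the at-least-once behaviour in miniature, which is one reason the processing functions should be safe to run again on the same record.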
LEARNINGS
● Function signatures should be unique (only one function should be
responsible for a single transformation)
● Functions, by design, should not belong to a single domain, but
map between two domains
● The consistency boundary is a partition (or a single aggregate root)
LEARNINGS
● A system can be seen as a composition of functions, but data needs
to be managed by an external system.
● For a function, we should test transformations, not side effects.
● Adding a correlation id on data sources is really useful for tracing, but
boundaries should be chosen carefully.
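A sketch of what testing a transformation rather than a side effect could look like, assuming a hypothetical to_export_record function and record layout. The test also checks that the correlation id set at the data source is carried through unchanged.

```python
def to_export_record(listing: dict) -> dict:
    # Hypothetical transformation from the listing domain to the export domain.
    return {
        "correlation_id": listing["correlation_id"],  # passed through for tracing
        "listing_id": listing["id"],
        "hidden": listing["status"] != "active",
    }


def test_active_listing_is_exported_visible():
    result = to_export_record(
        {"id": "listing-1", "status": "active", "correlation_id": "c-1"}
    )
    assert result == {"correlation_id": "c-1", "listing_id": "listing-1", "hidden": False}


def test_inactive_listing_is_hidden():
    result = to_export_record(
        {"id": "listing-2", "status": "inactive", "correlation_id": "c-2"}
    )
    assert result["hidden"] is True
    assert result["correlation_id"] == "c-2"
```

No broker, topic or HTTP client is needed: the test only compares input and output values.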
LEARNINGS
● Kafka Streams should not be used for external I/O. For example, if
you need a service that makes HTTP requests, use another streaming
engine for that (we used Akka Streams).
● Kafka Streams’ learning curve is really steep.
● Kafka Streams and Kafka are, by default, not there yet for medium-sized
messages (~50 KB). You will need to tweak and optimize the
configuration.
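The slides do not say which settings were changed. As one hedged illustration, the sketch below compares the producer's default batch.size (16,384 bytes) with a ~50 KB record. The override values are placeholders to be tuned, not recommendations from the talk, and the calculation ignores compression and per-record overhead.

```python
DEFAULT_BATCH_SIZE = 16_384   # producer batch.size default, in bytes
RECORD_SIZE = 50 * 1024       # ~50 KB record, as mentioned in the slide

# Hypothetical producer overrides; the values are assumptions, not measurements.
producer_overrides = {
    "batch.size": 262_144,
    "linger.ms": 20,
    "compression.type": "lz4",
}


def records_per_batch(batch_size_bytes: int, record_size_bytes: int) -> int:
    # Rough estimate of how many whole records fit into one batch.
    return batch_size_bytes // record_size_bytes


fits_by_default = records_per_batch(DEFAULT_BATCH_SIZE, RECORD_SIZE)
fits_with_override = records_per_batch(producer_overrides["batch.size"], RECORD_SIZE)
```

A result of zero for the default means a single record is already larger than a batch, so each record would travel in a batch of its own and batching brings no benefit until the size is raised.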
LEARNINGS
● Backpressure is a natural fit as functions are pull-based.
● Single-direction data-flow is a mindset that needs to be learned and
improved.
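Pull-based backpressure can be sketched with plain Python generators: nothing leaves the source until the downstream step asks for it. The source, transform and pulled names are hypothetical and only model the idea.

```python
from typing import Iterable, Iterator, List

pulled: List[str] = []  # records that have actually left the source


def source(records: Iterable[str]) -> Iterator[str]:
    for record in records:
        pulled.append(record)  # runs only when downstream requests a record
        yield record


def transform(upstream: Iterator[str]) -> Iterator[str]:
    for record in upstream:
        yield record.upper()


pipeline = transform(source(["a", "b", "c", "d"]))

# The consumer decides the pace: it asks for two records and then stops.
first_two = [next(pipeline), next(pipeline)]
```

After two requests only two records have been read from the source; the remaining ones wait until the consumer is ready for them.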
THANK YOU
For questions or suggestions:
Kevin Mas Ruiz (@skmruiz)
kmas@ThoughtWorks.com
Alexey Gravanov (@gravanov)
alexey.gravanov@scout24.com

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
WSO2CON 2024 - Building the API First Enterprise – Running an API Program, fr...
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 

A Functional Approach to Architecture - Kafka & Kafka Streams - Kevin Mas Ruiz & Alexey Gravanov (joint in Munich & Barcelona only)

  • 1. KAFKA & KAFKA STREAMS A FUNCTIONAL ARCHITECTURE KEVIN MAS RUIZ & ALEXEY GRAVANOV
  • 2. WHO WE ARE? ● Kevin Mas Ruiz, Thoughtworker ● Alexey Gravanov, AutoScoutie
  • 3. WHAT TO EXPECT? ● To meet ScoutWorks :) ● Tales about business requirements ● A brief introduction to some Kafka & Kafka Streams conventions ● See how we designed our architecture ● Talk about resilience in a functional architecture
  • 4. AUTOSCOUT24 ● Platform for selling cars & motorbikes ● 8 countries + 10 language versions ● 55+ thousand dealers ● 2.4+ million listings ● 3+ billion page impressions per month ● 10+ million active users per month
  • 5. OUR DOMAIN ● Listings are the core of the domain ● Images are one of the main sources of information in a listing ● Dealers want to export those listings to other marketplaces
  • 6. OUR PRODUCT A system able to export dealers’ high-quality listings to other marketplaces to improve their visibility on the market.
  • 7. BUSINESS REQUIREMENTS ● A dealer is capable of enabling and disabling the export process ● All active listings of a dealer will be exported ● Exported listings that become inactive or deleted should be hidden on external marketplaces
  • 8. MORE BUSINESS REQUIREMENTS ● It’s acceptable not to have the latest listing information exported in real time, but it should eventually be updated ● It’s important to have all listings on external marketplaces ASAP to ensure visibility ● The listing data format is dynamic, so it should be possible to reprocess a listing and export it again
  • 9. TECH REQUIREMENTS ● Load fluctuates during the day, scaling up / down is mandatory ● Easy to add additional marketplaces ● Easy to monitor / trace any listing
  • 11. KAFKA
  • 12. WHAT IS KAFKA? ● Distributed streaming platform ● Records are published to topics, which are made up of partitions ● Each partition is an append-only (*) structured commit log ● A record consists of a partition key, a value, a timestamp and an assigned offset, which is the position of the record in the log
  • 13. KAFKA GUARANTEES ● Sharding of records based on partition key ● Replication of records depending on configuration ● Ordering of records within a partition ● At-least-once delivery guarantee of records
  • 14. WHY KAFKA? Kafka is often used for building real-time streaming applications that transform or react to the streams of data.
  • 15. WHY KAFKA? ● Propagating listing changes fits the Kafka streaming mindset very well ● Possibility to go back in time and reprocess records if needed ● Lets developers design the system as a composition of small functions
  • 16. KAFKA STREAMS ● Opinionated library to process streams of records ● Makes it possible to build elastic, scalable and fault-tolerant solutions ● Uses Kafka to store current offsets / intermediate state of processed data ● Supports stateless processing, stateful processing and windowing operations, e.g. aggregates of records ● For stateless operations, lets us treat microservices as state-ignorant pure functions, letting Kafka Streams take care of side effects
  • 18. STREAMING VS MESSAGING ● Very similar approaches, but... ● Who has the fish? ● Go back in time and re-process records? ● Ordered records for a single aggregate root
  • 21. FUNCTIONS ARE ● Atomic: functions run once and completely, and cannot be interrupted ● Composable: functions can be chained, generating more abstract and business-related algebras ● State-ignorant: state is shared as a parameter, avoiding mutable state between functions
  • 22. CONSISTENCY BOUNDARIES ● Can only be ensured on a single partition ● Is degraded when repartitioning
  • 24. AGGREGATE ROOT ● Is the boundary of consistency ● Is a set of records in a single topic with the same partition key ● Represents a single business object (for example, a Listing)
  • 33. Functions are based on an iterative business language, not on size
  • 35. "Everything fails all the time." Werner Vogels VP & CTO at Amazon.com
  • 36. KAFKA For every topic with a replication factor of N, Kafka tolerates the failure of up to N-1 nodes.
  • 37. KAFKA STREAMS ● Single-node setup: after coming back, the node picks up where processing stopped ● Multi-node setup: other nodes take over, but… ○ Stateless processor: continues working as soon as nodes are rebalanced ○ Stateful processor, simple setup: it can take a while until the state is built up ○ Stateful processor, hot stand-by setup: local state is being built up, but records are not actually processed until failover happens
  • 38. LEARNINGS ● Function signature should be unique (only one function should be responsible for a single transformation) ● Functions, by design, should not pertain to a single domain, but map two domains ● The consistency boundary is a partition (or a single aggregate root)
  • 39. LEARNINGS ● A system can be seen as a composition of functions, but data needs to be managed by an external system. ● For a function, we should test transformations, not side effects. ● Adding a correlation id on data sources is really useful for tracing, but boundaries should be chosen carefully.
  • 40. LEARNINGS ● Kafka Streams should not be used for external I/O. For example, if you need a service that makes HTTP requests, use another streaming engine for that (we used Akka Streams). ● Kafka Streams’ learning curve is really steep. ● By default, Kafka Streams and Kafka are not there yet for medium-sized messages (around 50 KB). You will need to tweak and optimize the configuration.
  • 41. LEARNINGS ● Backpressure is a natural fit as functions are pull-based. ● Single-direction data-flow is a mindset that needs to be learned and improved.
  • 42. THANK YOU For questions or suggestions: Kevin Mas Ruiz (@skmruiz) kmas@ThoughtWorks.com Alexey Gravanov (@gravanov) alexey.gravanov@scout24.com
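
The ideas from slides 13, 21 and 24 (sharding by partition key, state-ignorant functions, and the aggregate root as consistency boundary) can be illustrated with a small sketch. This is plain Python with no Kafka client involved, not the implementation from the talk. The partition count, the event shapes and the names `partition_for`, `publish`, `apply_event` and `fold_listing` are all assumptions made for illustration, and `zlib.crc32` is only a stand-in for the hash used by a real partitioner.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: an arbitrary partition count for the sketch


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a partition key to a partition.

    crc32 is a stand-in here; the point is only that the mapping is
    deterministic, so the same key always lands in the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def publish(log, key, value):
    """Append a record to the partition chosen by its key.

    Returns (partition, offset), where offset is the position in that log.
    """
    partition = partition_for(key)
    log[partition].append((key, value))
    return partition, len(log[partition]) - 1


def apply_event(state, event):
    """State-ignorant function: state comes in as a parameter and a new
    state is returned, nothing is mutated in place."""
    if event["type"] == "updated":
        return {**state, "price": event["price"], "visible": True}
    if event["type"] == "deleted":
        return {**state, "visible": False}
    return state


def fold_listing(log, listing_id):
    """Rebuild one aggregate root (a listing) by replaying, in offset order,
    the records with its key from the single partition that holds them."""
    state = {}
    for key, event in log[partition_for(listing_id)]:
        if key == listing_id:
            state = apply_event(state, event)
    return state


# Hypothetical events for two listings.
log = defaultdict(list)
first = publish(log, "listing-1", {"type": "updated", "price": 9500})
publish(log, "listing-2", {"type": "updated", "price": 12000})
second = publish(log, "listing-1", {"type": "updated", "price": 9000})
third = publish(log, "listing-1", {"type": "deleted"})

listing_1 = fold_listing(log, "listing-1")
listing_2 = fold_listing(log, "listing-2")
```

Because every record for `listing-1` shares a key, all of them sit in one partition with increasing offsets, so replaying them always yields the same final state. Records for different listings may share a partition or not; no ordering is assumed between them.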