SlideShare ist ein Scribd-Unternehmen logo
1 von 55
Downloaden Sie, um offline zu lesen
From Monolith to Microservices
Paul Puschmann
OSDC 2018, Berlin
Our history
3
Details REWE GROUP
Turnover
>54 bn
History
> 90 years
Employees
>330.000
Industries
Food Retail,
Tourism, DIY
Shops
>15.000
4
What do we actually run?
Our history at REWE Digital
● Have a good platform / software architecture
● Scale the application
● Enable fast delivery of features, accelerate the business
Main Goals
Our history at REWE Digital
2014
40 15
100
28
150
51
2015 2016 2017
# Services
# Teams
1 2
200
51
2018
This is an approximation...
● Take care of our “Managed-Hosting”
● Re-automate an already existing PROD-environment with Ansible
● Keep everything running
● Support our developers
● Pager-Duty
What did the Ops-people do?
● Integration of new features — difficult
● Deployments every two weeks — slow
● Deployments took eventually 1h — slow
● “Everything” in the monolith (plus databases) — dependency constraints
Status Quo of the Monolith
2014 / beginning of year 2015
Wishes of the stakeholders:
● Features, features, features
● Application must not break
Developers
● We want to code, test, deploy
Ops
● Really!?
Wishes
Beginning of year 2015
● New features aren’t built into the monolith anymore,
but as a separate applications
● There are strong guidelines regarding
- API
- Monitoring interfaces
- Logging
Beginning of 2015
The Plan
● While building new functionality in to micro-services,
existing features were extracted from the monolith (scoop out)
to allow faster, independent development of features
● This should remove all BL from the monolith soon
Continued…
The Plan
The pressure of having a perfectly working runtime-environment soon was quite high.
Pragmatic decision: poor-mans micro-services
● @devs: please package your app in a .deb-package
● we do the rest via Ansible and HAProxy
Bring this to life, then move on
Containers? Yes, but no.
How should we manage all those new applications?
Our solution Containers
Should we use the early versions of Kubernetes or Mesosphere Marathon?
No.
We wanted to have an environment
we were able to understand, automate and manage.
So we created a custom Docker-environment with Docker & Swarm.
Containers? Yes, but how?
How should we manage all those containers?
● Debian machines (VMs & Metal)
● Docker-CE
● Docker Swarm (not swarm-mode)
● Consul
● Consul-template
● Dnsmasq
● Nginx
● Deployment with Ansible
● Secrets managed by “Ops"
- sorry, no “Hashicorp Vault”, yet
Our solution consists of…
We say: No.
Because we created a solution we were capable to run and fits our needs.
“Best Practises” don’t work for everybody.
Only downsides:
• Docker swarm isn’t good at “deploy spread”
• We’ve no orchestration-service that ensures our containers are running fine
and in the right
Reinventing the wheel?
● Blue / Green deployment
● Semi-automatic configuration / deployment of new databases
● Fully automated set-up of Team-Jenkins instances (even at GCP)
● A custom-built platform-service called “Slash” as a service-interface for our
developers
● Use a common set of Docker base-images with daily builds
Additional goodies
Docker-Images
Automatic build-chains
with night builds for
our Docker
base-images and
dependent images.
Example:
Java-base images
All is fine now?
What about
● Monitoring —> Check.
● Responsibility —> ?
● THE MONOLITH?
The Monolith…
… is still in production.
Scaling Services
Our 45 teams are developing and running more than 200 services
Imagine if all of them talk to each other:
Scale at Servicelevel
Our 45 teams are developing and running more than 200 services
Imagine if all of them talk to each other:
Scale at Servicelevel
Our 45 teams are developing and running more than 200 services
Imagine if all of them talk to each other:
Scale at Servicelevel
Challenges in HTTP/REST-only architectures
Gateway
µService 1
µService 2
µService 5µService 4
Things that help:
● API-Guidelines
● Timeouts
● Fallbacks
● Circuit Breakers
● Eventing
µService 3
What is Eventing?
● Enable services to provide themselves with data asynchronously before it is needed in a request
● Kind of database replication
● More performance & stability
Service A
a
Service B Service C
ca cba’ c’
What is the goal of Eventing?
produce produceconsumeconsume
Representation of something that happened in the domain
(Evans)
● An event concerns:
○ one domain entity (e.g. “customer”, “shopping cart”, “order”, “delivery area”)
○ and one state change that happened to that entity (e.g. “customer registered”,
“item added to shopping cart”, “order fulfilled”, “delivery area expanded”)
● Event = functional object to describe domain changes
● Event = vehicle for database replication
What is a (domain-) event?
● ID: Unique identifier
● Key: Which entity is affected?
● Version: Which version of this entity is this?
● Time: When did the event occur?
● Type: What kind of action happened?
● Payload: What are the details?
○ Entire entity - not deltas!
{
“id” : “4ea55fbb7c887”,
“key” : “7ebc8eeb1f2f45”,
“version” : 1,
“time” : "2018-02-22T17:05:55Z",
“type” : “customer-registered”,
“payload” : {
“id” : “7ebc8eeb1f2f45”,
“version” : 1,
“first_name” : “Paul”,
“last_name” : “Puschmann”,
“e-mail” : “bofh(at)rewe-digital.com”
}
}
Technical Event
customer
data
customer topic
Customer
Data
Service
<<publish Event>>
Invoice
Service
customer
data’
<<subscribe Event>>
Loyalty
Service
customer
data’’
<<subscribe Event>>
. . .
“payload”: {
“customer_uuid” : ”876ef6e5”,
“version” : 3,
“name” : “Peter Smith”,
“loyalty_id” : “477183877”,
“invoice_address” : “752 High Street”,
“delivery_address” : “67 Liverpool Street”
}
“customer_uuid” : ”876ef6e5”,
“version” : 3,
“name” : “Peter Smith”,
“invoice_address” : “752 High Street”
“customer_uuid” : ”876ef6e5”,
“version” : 3,
“name” : “Peter Smith”,
“loyalty_id” : “477183877”
Example: Customer Data
We chose Apache Kafka
● Open-source stream processing platform written in Scala and
Java
● High-throughput, low-latency platform for real-time data
streams
● Originally developed at Linkedin, open sourced in 2011
● Offers 4 APIs: Producer, Consumer, Stream, Connect
● We use Apache Kafka in a pub-sub manner. This means most of
our services use the Producer and Consumer APIs
“Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant,
wicked fast, and runs in production in thousands of companies.” (https://kafka.apache.org/)
Apache Kafka
● In general every resource a service
owns should be published to a topic
● Every state change in the domain is
published as an event
● Those events are sent as messages
to topics
● Topics can be created when the first
event arrives
{
“id” : “4ea55fbb7c887”,
“key” : “7ebc8eeb1f2f45”,
“version” : 1,
“time” : "2018-02-22T17:05:55Z",
“type” : “customer-registered”,
“payload” : {
“id” : “7ebc8eeb1f2f45”,
“version” : 1,
“first_name” : “Paul”,
“last_name” : “Puschmann”,
“e-mail” : “bofh(at)rewe-digital.com”
}
}
What to send as messages
● Domain events are sent to topics
● Topics can have any number of subscribers
● Topics are split in partitions, the order is
only ensured inside a partition
● Each record has a sequential ID assigned
● Partitions are distributed over the Cluster
● Partitions can have a configurable number
of replicas
http://kafka.apache.org/documentation.html
Topics and their organisation
● An endless stream of events is stored in the
cluster
● Only the most recent version of an entity is
kept
● No point-in-time access to data
● Choose a wise key for your entity and
update a single entity always to this key
http://kafka.apache.org/documentation.html#compaction
Log-Compaction
● Every service which owns a resource should publish those resource-entities to a topic
● Use only one producer or make sure there are no issues about the order of events
● To enable log-compaction use a partitioner that ensures an event with the same key is always sent to
the same partition
● All producers should be able to republish all entities on request
A
B
C
D
<<publish>>
<<subscribe>>
<<subscribe>>
<<subscribe>>
topic
Producers
a
<<persist>>
Entity
Repo
Event
Repo
Published
Version
Repo
Producer
<<upsert>>
<<upsert>>
TX_1
<<delete>>
Topic
<<upsert>>
TX_2
<<publish>>
● The producer has to make sure that the
message is delivered and committed
● Therefore we store the raw event in a
database to enable retries until it’s
committed to the cluster
● Scheduled jobs can take care of retries
and cleanup
Producers
● Every service can consume every available data and should consume all data it needs to fulfill
a request - having data at request time is better than trying to get it from another service
● The consumer has to process events idempotently. An event could be consumed more than
once. The infrastructure ensures at-least-once delivery
● Consumers have to take care of deployment specialties like blue/green
● Consumers should be able to re-consume from the beginning. For instance more data is
needed
● Consumers only should persist the data they really need
Consumers
● The consumer is responsible for a
manual commit only after a successful
processing of the event. Successful can
mean:
● Needed data from an event is saved in
the services data-store
● The event can’t be processed and is
stored in a private error queue / table
Entity
Repo
Error
Repo
Processed
Version
Repo
Consumer
<<upsert>>
Topic
<<subscribe>>
<<upsert>>
<<insert>>
Consumers
● The consumer is responsible for a
manual commit only after a successful
processing of the event. Successful can
mean:
● Needed data from an event is saved in
the services data-store
● The event can’t be processed and is
stored in a private error queue / table
Entity
Repo
Error
Repo
Processed
Version
Repo
Consumer
<<upsert>>
Topic
<<subscribe>>
<<upsert>>
<<insert>>
Kafka & Ops
Eventing benefits for Operation
Pros Contras
• Each service has its own database:
This impacts / supports migrations, query
tuning, database usage
• Topic replication:
The replication of topics to different brokers
offer support for second datacenters or
migration to different environments
• Asynchrony:
Services don’t need no do synchronous calls
to share their data with other services
• Another super important service:
Kafka is the hub of your business data.
Take care about this.
• Redundancy of data:
Your databases will store the same data, or
subsets, more than once
• Asynchrony:
A consumer may not be up-to-date with
some topics, this might lead to
inconsistencies, e.g. in the frontend
Kafka & Ops
Eventing benefits for Operation
Kafka & Ops
Eventing benefits for Operation
By using the concept of “Kafka-mirrors”,
you can push selected topics to a different Kafka-Cluster (one-way).
This way you easily can setup services as consumers at a different datacenter.
For producers you’d shut down the producer, switch the direction
of the “kafka-mirror” and the start the producer “at the other side”.
Optionally: delete the topic, create and fill it anew.
Kafka can help you to migrate between providers / datacenters. —> NICE!
Different things
• We run Logstash on each VM to read, filter and ship log-events
to have consistency in log-mangling per VM (for infrastructure services / daemons).
• Each application (microservice, rest of the monolith) is forced to write log-events in a
defined JSON-structure.
• Our central “logindexers” only have to do one thing:
Read from Redis and deliver to Elasticsearch as fast as possible.
(we at least think are doing differently)
What do we do differently?
Helpful things
● Strong Architecture-Guild:
- Eventing-Guide
- API-Guide
- … and many more
● Active Communication of changes & constraints
● Monthly / Bi-monthly Bootcamps for (new) colleagues
What helped us most?
• Introduction of Eventing (with Kafka)
• Make development teams analyse logs & metrics on their own
• Strong usage of ELK
• Strong usage of Prometheus
• External traffic (Web, mobile App, partners)
always has to get routed through a gateway
Continued …
What helped us most?
What we Learned
• Communication is a key factor
• Automation pays off
• Eventing with Kafka is cool
• Temporary solutions last very long
• The knowledge / distribution of RACI helps
• UBIURI (you built it, you run it) is not only an option
What did we learn?
We did try to scoop out the Monolith. —> That was not a good idea.
Perhaps better:
Put a gateway in front of your legacy-application and switch resource by ressource.
Every service must have an owner!
What did we learn?
Continued …
The Future
Our Future
For Eventing-Foo:
• Markus Rebbert @mrebbert
• Sebastian Gauder @rattakresch
• Ansgar Brauner @a_brauner
For motivating me to give this talk
• Matthias Jambor @grambulf
Thanks go to:
Thank You!
Paul Puschmann I @ppuschmann I www.rewe-digital.com I @rewedigitaltech
All background pictures are licensed under CC0. Source: pexels.com
From Monolith to Microservices
OSDC 2018, Berlin

Weitere ähnliche Inhalte

Was ist angesagt?

Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017Codemotion
 
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADS
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADSKNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADS
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADSElad Hirsch
 
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...Ambassador Labs
 
Kafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appKafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appNeil Avery
 
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...StreamNative
 
"[WORKSHOP] K8S for developers", Denis Romanuk
"[WORKSHOP] K8S for developers", Denis Romanuk"[WORKSHOP] K8S for developers", Denis Romanuk
"[WORKSHOP] K8S for developers", Denis RomanukFwdays
 
Innovating faster with SBT, Continuous Delivery, and LXC
Innovating faster with SBT, Continuous Delivery, and LXCInnovating faster with SBT, Continuous Delivery, and LXC
Innovating faster with SBT, Continuous Delivery, and LXCkscaldef
 
BlaBlaCar and infrastructure automation
BlaBlaCar and infrastructure automationBlaBlaCar and infrastructure automation
BlaBlaCar and infrastructure automationsinfomicien
 
Microservices: The Right Way
Microservices: The Right WayMicroservices: The Right Way
Microservices: The Right WayDaniel Woods
 
Let the alpakka pull your stream
Let the alpakka pull your streamLet the alpakka pull your stream
Let the alpakka pull your streamEnno Runne
 
Gluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGwen (Chen) Shapira
 
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...confluent
 
Bootstrapping Microservices with Kafka, Akka and Spark
Bootstrapping Microservices with Kafka, Akka and SparkBootstrapping Microservices with Kafka, Akka and Spark
Bootstrapping Microservices with Kafka, Akka and SparkAlex Silva
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
Serverless and Servicefull Applications - Where Microservices complements Ser...
Serverless and Servicefull Applications - Where Microservices complements Ser...Serverless and Servicefull Applications - Where Microservices complements Ser...
Serverless and Servicefull Applications - Where Microservices complements Ser...Red Hat Developers
 
Kubernetes go paddle meetup
Kubernetes go paddle meetupKubernetes go paddle meetup
Kubernetes go paddle meetupSujai Prakasam
 
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...confluent
 
Design and Implementation of Incremental Cooperative Rebalancing
Design and Implementation of Incremental Cooperative RebalancingDesign and Implementation of Incremental Cooperative Rebalancing
Design and Implementation of Incremental Cooperative Rebalancingconfluent
 
101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)Henning Spjelkavik
 

Was ist angesagt? (20)

Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017
Container orchestration: the cold war - Giulio De Donato - Codemotion Rome 2017
 
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADS
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADSKNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADS
KNATIVE - DEPLOY, AND MANAGE MODERN CONTAINER-BASED SERVERLESS WORKLOADS
 
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...
2017 Microservices Practitioner Virtual Summit: The Mechanics of Deploying En...
 
Kafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appKafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming app
 
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...
Exclusive Producer: Using Pulsar to Build Distributed Applications - Pulsar S...
 
"[WORKSHOP] K8S for developers", Denis Romanuk
"[WORKSHOP] K8S for developers", Denis Romanuk"[WORKSHOP] K8S for developers", Denis Romanuk
"[WORKSHOP] K8S for developers", Denis Romanuk
 
Innovating faster with SBT, Continuous Delivery, and LXC
Innovating faster with SBT, Continuous Delivery, and LXCInnovating faster with SBT, Continuous Delivery, and LXC
Innovating faster with SBT, Continuous Delivery, and LXC
 
Docker+java
Docker+javaDocker+java
Docker+java
 
BlaBlaCar and infrastructure automation
BlaBlaCar and infrastructure automationBlaBlaCar and infrastructure automation
BlaBlaCar and infrastructure automation
 
Microservices: The Right Way
Microservices: The Right WayMicroservices: The Right Way
Microservices: The Right Way
 
Let the alpakka pull your stream
Let the alpakka pull your streamLet the alpakka pull your stream
Let the alpakka pull your stream
 
Gluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGluecon - Kafka and the service mesh
Gluecon - Kafka and the service mesh
 
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
Flexible Authentication Strategies with SASL/OAUTHBEARER (Michael Kaminski, T...
 
Bootstrapping Microservices with Kafka, Akka and Spark
Bootstrapping Microservices with Kafka, Akka and SparkBootstrapping Microservices with Kafka, Akka and Spark
Bootstrapping Microservices with Kafka, Akka and Spark
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
Serverless and Servicefull Applications - Where Microservices complements Ser...
Serverless and Servicefull Applications - Where Microservices complements Ser...Serverless and Servicefull Applications - Where Microservices complements Ser...
Serverless and Servicefull Applications - Where Microservices complements Ser...
 
Kubernetes go paddle meetup
Kubernetes go paddle meetupKubernetes go paddle meetup
Kubernetes go paddle meetup
 
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...
KSQL Performance Tuning for Fun and Profit ( Nick Dearden, Confluent) Kafka S...
 
Design and Implementation of Incremental Cooperative Rebalancing
Design and Implementation of Incremental Cooperative RebalancingDesign and Implementation of Incremental Cooperative Rebalancing
Design and Implementation of Incremental Cooperative Rebalancing
 
101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)
 

Ähnlich wie OSDC 2018 | From Monolith to Microservices by Paul Puschmann_

Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kuberneteskloia
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetesRishabh Indoria
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerAndrew Phillips
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...StreamNative
 
Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015Eric Sammer
 
Building a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaBuilding a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaTreasure Data, Inc.
 
Devops - Microservice and Kubernetes
Devops - Microservice and KubernetesDevops - Microservice and Kubernetes
Devops - Microservice and KubernetesNodeXperts
 
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Outlyer
 
Devops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftDevops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftYaniv cohen
 
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureMRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureNGINX, Inc.
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsMonal Daxini
 
Domain events & Kafka in Ruby applications
Domain events & Kafka in Ruby applicationsDomain events & Kafka in Ruby applications
Domain events & Kafka in Ruby applicationsSpyros Livathinos
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for BeginnersDigitalOcean
 
Successful DevOps implementation for small teams a true story
Successful DevOps implementation for small teams  a true storySuccessful DevOps implementation for small teams  a true story
Successful DevOps implementation for small teams a true storyJakub Paweł Głazik
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...confluent
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uberconfluent
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsSérgio Nunes
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...Amazon Web Services
 
GCCP JSCOE Session 2
GCCP JSCOE Session 2GCCP JSCOE Session 2
GCCP JSCOE Session 2GDSC
 

Ähnlich wie OSDC 2018 | From Monolith to Microservices by Paul Puschmann_ (20)

Ultimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on KubernetesUltimate Guide to Microservice Architecture on Kubernetes
Ultimate Guide to Microservice Architecture on Kubernetes
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Netflix Data Pipeline With Kafka
Netflix Data Pipeline With KafkaNetflix Data Pipeline With Kafka
Netflix Data Pipeline With Kafka
 
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with SpinnakerSpinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
Spinnaker Summit 2018: CI/CD Patterns for Kubernetes with Spinnaker
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015Building a system for machine and event-oriented data - Data Day Seattle 2015
Building a system for machine and event-oriented data - Data Day Seattle 2015
 
Building a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with RocanaBuilding a system for machine and event-oriented data with Rocana
Building a system for machine and event-oriented data with Rocana
 
Devops - Microservice and Kubernetes
Devops - Microservice and KubernetesDevops - Microservice and Kubernetes
Devops - Microservice and Kubernetes
 
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
Owain Perry (Just Giving) - Continuous Delivery of Windows Micro-Services in ...
 
Devops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShiftDevops with Python by Yaniv Cohen DevopShift
Devops with Python by Yaniv Cohen DevopShift
 
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference ArchitectureMRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
MRA AMA Part 10: Kubernetes and the Microservices Reference Architecture
 
The Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data ProblemsThe Netflix Way to deal with Big Data Problems
The Netflix Way to deal with Big Data Problems
 
Domain events & Kafka in Ruby applications
Domain events & Kafka in Ruby applicationsDomain events & Kafka in Ruby applications
Domain events & Kafka in Ruby applications
 
Kubernetes for Beginners
Kubernetes for BeginnersKubernetes for Beginners
Kubernetes for Beginners
 
Successful DevOps implementation for small teams a true story
Successful DevOps implementation for small teams  a true storySuccessful DevOps implementation for small teams  a true story
Successful DevOps implementation for small teams a true story
 
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
Modern Cloud-Native Streaming Platforms: Event Streaming Microservices with A...
 
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ UberKafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
Kafka Summit NYC 2017 - Scalable Real-Time Complex Event Processing @ Uber
 
Kafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notificationsKafka used at scale to deliver real-time notifications
Kafka used at scale to deliver real-time notifications
 
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
(APP307) Leverage the Cloud with a Blue/Green Deployment Architecture | AWS r...
 
GCCP JSCOE Session 2
GCCP JSCOE Session 2GCCP JSCOE Session 2
GCCP JSCOE Session 2
 

Kürzlich hochgeladen

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Kürzlich hochgeladen (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

OSDC 2018 | From Monolith to Microservices by Paul Puschmann_

  • 1. From Monolith to Microservices Paul Puschmann OSDC 2018, Berlin
  • 3. 3 Details REWE GROUP Turnover >54 bn History > 90 years Employees >330.000 Industries Food Retail, Tourism, DIY Shops >15.000
  • 4. 4 What do we actually run?
  • 5. Our history at REWE Digital
  • 6. ● Have a good platform / software architecture ● Scale the application ● Enable fast delivery of features, accelerate the business Main Goals
  • 7. Our history at REWE Digital 2014 40 15 100 28 150 51 2015 2016 2017 # Services # Teams 1 2 200 51 2018 This is an approximation...
  • 8. ● Take care of our “Managed-Hosting” ● Re-automate an already existing PROD-environment with Ansible ● Keep everything running ● Support our developers ● Pager-Duty What did the Ops-people do?
  • 9. ● Integration of new features — difficult ● Deployments every two weeks — slow ● Deployments took eventually 1h — slow ● “Everything” in the monolith (plus databases) — dependency constraints Status Quo of the Monolith 2014 / beginning of year 2015
  • 10. Wishes of the stakeholders: ● Features, features, features ● Application must not break Developers ● We want to code, test, deploy Ops ● Really!? Wishes Beginning of year 2015
  • 11. ● New features aren’t built into the monolith anymore, but as a separate applications ● There are strong guidelines regarding - API - Monitoring interfaces - Logging Beginning of 2015 The Plan
  • 12. ● While building new functionality in to micro-services, existing features were extracted from the monolith (scoop out) to allow faster, independent development of features ● This should remove all BL from the monolith soon Continued… The Plan
  • 13. The pressure of having a perfectly working runtime-environment soon was quite high. Pragmatic decision: poor-mans micro-services ● @devs: please package your app in a .deb-package ● we do the rest via Ansible and HAProxy Bring this to life, then move on Containers? Yes, but no. How should we manage all those new applications?
  • 15. Should we use the early versions of Kubernetes or Mesosphere Marathon? No. We wanted to have an environment we were able to understand, automate and manage. So we created a custom Docker-environment with Docker & Swarm. Containers? Yes, but how? How should we manage all those containers?
  • 16. ● Debian machines (VMs & Metal) ● Docker-CE ● Docker Swarm (not swarm-mode) ● Consul ● Consul-template ● Dnsmasq ● Nginx ● Deployment with Ansible ● Secrets managed by “Ops" - sorry, no “Hashicorp Vault”, yet Our solution consists of…
  • 17. We say: No. Because we created a solution we were capable to run and fits our needs. “Best Practises” don’t work for everybody. Only downsides: • Docker swarm isn’t good at “deploy spread” • We’ve no orchestration-service that ensures our containers are running fine and in the right Reinventing the wheel?
  • 18. ● Blue / Green deployment ● Semi-automatic configuration / deployment of new databases ● Fully automated set-up of Team-Jenkins instances (even at GCP) ● A custom-built platform-service called “Slash” as a service-interface for our developers ● Use a common set of Docker base-images with daily builds Additional goodies
  • 19. Docker-Images Automatic build-chains with night builds for our Docker base-images and dependent images. Example: Java-base images
  • 20. All is fine now? What about ● Monitoring —> Check. ● Responsibility —> ? ● THE MONOLITH?
  • 21. The Monolith… … is still in production.
  • 23. Our 45 teams are developing and running more than 200 services Imagine if all of them talk to each other: Scale at Servicelevel
  • 24. Our 45 teams are developing and running more than 200 services Imagine if all of them talk to each other: Scale at Servicelevel
  • 25. Our 45 teams are developing and running more than 200 services Imagine if all of them talk to each other: Scale at Servicelevel
  • 26. Challenges in HTTP/REST-only architectures Gateway µService 1 µService 2 µService 5µService 4 Things that help: ● API-Guidelines ● Timeouts ● Fallbacks ● Circuit Breakers ● Eventing µService 3
  • 28. ● Enable services to provide themselves with data asynchronously before it is needed in a request ● Kind of database replication ● More performance & stability Service A a Service B Service C ca cba’ c’ What is the goal of Eventing? produce produceconsumeconsume
  • 29. Representation of something that happened in the domain (Evans) ● An event concerns: ○ one domain entity (e.g. “customer”, “shopping cart”, “order”, “delivery area”) ○ and one state change that happened to that entity (e.g. “customer registered”, “item added to shopping cart”, “order fulfilled”, “delivery area expanded”) ● Event = functional object to describe domain changes ● Event = vehicle for database replication What is a (domain-) event?
  • 30. ● ID: Unique identifier ● Key: Which entity is affected? ● Version: Which version of this entity is this? ● Time: When did the event occur? ● Type: What kind of action happened? ● Payload: What are the details? ○ Entire entity - not deltas! { “id” : “4ea55fbb7c887”, “key” : “7ebc8eeb1f2f45”, “version” : 1, “time” : "2018-02-22T17:05:55Z", “type” : “customer-registered”, “payload” : { “id” : “7ebc8eeb1f2f45”, “version” : 1, “first_name” : “Paul”, “last_name” : “Puschmann”, “e-mail” : “bofh(at)rewe-digital.com” } } Technical Event
  • 31. customer data customer topic Customer Data Service <<publish Event>> Invoice Service customer data’ <<subscribe Event>> Loyalty Service customer data’’ <<subscribe Event>> . . . “payload”: { “customer_uuid” : ”876ef6e5”, “version” : 3, “name” : “Peter Smith”, “loyalty_id” : “477183877”, “invoice_address” : “752 High Street”, “delivery_address” : “67 Liverpool Street” } “customer_uuid” : ”876ef6e5”, “version” : 3, “name” : “Peter Smith”, “invoice_address” : “752 High Street” “customer_uuid” : ”876ef6e5”, “version” : 3, “name” : “Peter Smith”, “loyalty_id” : “477183877” Example: Customer Data
  • 33. ● Open-source stream processing platform written in Scala and Java ● High-throughput, low-latency platform for real-time data streams ● Originally developed at Linkedin, open sourced in 2011 ● Offers 4 APIs: Producer, Consumer, Stream, Connect ● We use Apache Kafka in a pub-sub manner. This means most of our services use the Producer and Consumer APIs “Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.” (https://kafka.apache.org/) Apache Kafka
  • 34. ● In general every resource a service owns should be published to a topic ● Every state change in the domain is published as an event ● Those events are sent as messages to topics ● Topics can be created when the first event arrives { “id” : “4ea55fbb7c887”, “key” : “7ebc8eeb1f2f45”, “version” : 1, “time” : "2018-02-22T17:05:55Z", “type” : “customer-registered”, “payload” : { “id” : “7ebc8eeb1f2f45”, “version” : 1, “first_name” : “Paul”, “last_name” : “Puschmann”, “e-mail” : “bofh(at)rewe-digital.com” } } What to send as messages
  • 35. ● Domain events are sent to topics ● Topics can have any number of subscribers ● Topics are split in partitions, the order is only ensured inside a partition ● Each record has a sequential ID assigned ● Partitions are distributed over the Cluster ● Partitions can have a configurable number of replicas http://kafka.apache.org/documentation.html Topics and their organisation
  • 36. ● An endless stream of events is stored in the cluster ● Only the most recent version of an entity is kept ● No point-in-time access to data ● Choose a wise key for your entity and update a single entity always to this key http://kafka.apache.org/documentation.html#compaction Log-Compaction
  • 37. ● Every service which owns a resource should publish those resource-entities to a topic ● Use only one producer or make sure there are no issues about the order of events ● To enable log-compaction use a partitioner that ensures an event with the same key is always sent to the same partition ● All producers should be able to republish all entities on request A B C D <<publish>> <<subscribe>> <<subscribe>> <<subscribe>> topic Producers a <<persist>>
  • 38. Entity Repo Event Repo Published Version Repo Producer <<upsert>> <<upsert>> TX_1 <<delete>> Topic <<upsert>> TX_2 <<publish>> ● The producer has to make sure that the message is delivered and committed ● Therefore we store the raw event in a database to enable retries until it’s committed to the cluster ● Scheduled jobs can take care of retries and cleanup Producers
  • 39. ● Every service can consume every available data and should consume all data it needs to fulfill a request - having data at request time is better than trying to get it from another service ● The consumer has to process events idempotently. An event could be consumed more than once. The infrastructure ensures at-least-once delivery ● Consumers have to take care of deployment specialties like blue/green ● Consumers should be able to re-consume from the beginning. For instance more data is needed ● Consumers only should persist the data they really need Consumers
  • 40. ● The consumer is responsible for a manual commit only after a successful processing of the event. Successful can mean: ● Needed data from an event is saved in the services data-store ● The event can’t be processed and is stored in a private error queue / table Entity Repo Error Repo Processed Version Repo Consumer <<upsert>> Topic <<subscribe>> <<upsert>> <<insert>> Consumers
  • 41. ● The consumer is responsible for a manual commit only after a successful processing of the event. Successful can mean: ● Needed data from an event is saved in the services data-store ● The event can’t be processed and is stored in a private error queue / table Entity Repo Error Repo Processed Version Repo Consumer <<upsert>> Topic <<subscribe>> <<upsert>> <<insert>> Kafka & Ops Eventing benefits for Operation
  • 42. Pros Contras • Each service has its own database: This impacts / supports migrations, query tuning, database usage • Topic replication: The replication of topics to different brokers offer support for second datacenters or migration to different environments • Asynchrony: Services don’t need no do synchronous calls to share their data with other services • Another super important service: Kafka is the hub of your business data. Take care about this. • Redundancy of data: Your databases will store the same data, or subsets, more than once • Asynchrony: A consumer may not be up-to-date with some topics, this might lead to inconsistencies, e.g. in the frontend Kafka & Ops Eventing benefits for Operation
  • 43. Kafka & Ops Eventing benefits for Operation By using the concept of “Kafka-mirrors”, you can push selected topics to a different Kafka-Cluster (one-way). This way you easily can setup services as consumers at a different datacenter. For producers you’d shut down the producer, switch the direction of the “kafka-mirror” and the start the producer “at the other side”. Optionally: delete the topic, create and fill it anew. Kafka can help you to migrate between providers / datacenters. —> NICE!
  • 45. • We run Logstash on each VM to read, filter and ship log-events to have consistency in log-mangling per VM (for infrastructure services / daemons). • Each application (microservice, rest of the monolith) is forced to write log-events in a defined JSON-structure. • Our central “logindexers” only have to do one thing: Read from Redis and deliver to Elasticsearch as fast as possible. (we at least think are doing differently) What do we do differently?
  • 47. ● Strong Architecture-Guild: - Eventing-Guide - API-Guide - … and many more ● Active Communication of changes & constraints ● Monthly / Bi-monthly Bootcamps for (new) colleagues What helped us most?
  • 48. • Introduction of Eventing (with Kafka) • Make development teams analyse logs & metrics on their own • Strong usage of ELK • Strong usage of Prometheus • External traffic (Web, mobile App, partners) always has to get routed through a gateway Continued … What helped us most?
  • 50. • Communication is a key factor • Automation pays off • Eventing with Kafka is cool • Temporary solutions last very long • The knowledge / distribution of RACI helps • UBIURI (you built it, you run it) is not only an option What did we learn?
  • 51. We did try to scoop out the Monolith. —> That was not a good idea. Perhaps better: Put a gateway in front of your legacy-application and switch resource by ressource. Every service must have an owner! What did we learn? Continued …
  • 54. For Eventing-Foo: • Markus Rebbert @mrebbert • Sebastian Gauder @rattakresch • Ansgar Brauner @a_brauner For motivating me to give this talk • Matthias Jambor @grambulf Thanks go to:
  • 55. Thank You! Paul Puschmann I @ppuschmann I www.rewe-digital.com I @rewedigitaltech All background pictures are licensed under CC0. Source: pexels.com From Monolith to Microservices OSDC 2018, Berlin