Kafka has become more than a simple message bus: with a full stack of tooling and new concepts, it’s easy to start deploying complex service meshes, communicating through Kafka, enabling decoupled microservices, stable performance, high scalability and reusability.
As the service mesh grows, so do its complexity and the effort needed to manage it. Maybe we need to add a new validation or computational step. Maybe we need to reorder how we process messages from one topic to the other. But how can we manage this change without turning our microservices into a distributed monolith, or resorting to complex and dangerous weekend reconfigurations?
In this presentation, we show an easy-to-implement solution over Kafka: injecting sagas (DAGs describing the execution path) into messages. This allows us to build simple, stateless microservices, giving a solution that is easy to reason about and inherently extensible.
Designing a Service Mesh with Kafka and Sagas | David Navalho, Marionete and Ricardo Miranda, Closer.pt
1. Designing a Service Mesh with Kafka and Sagas
David Navalho & Ricardo Miranda
2. Who we are
Ricardo Miranda
Senior Consultant @ Closer.pt
mail@ricardoMiranda.com
David Navalho
Head of Core Technologies @ Marionete
david.navalho@marionete.co.uk
3. Context
Develop a new, complex project
• In very little time (Agile! Results now!)
• Needs to integrate with existing data, services
• Increased resources vs increased productivity
• Reusability and Simplicity
4. Context
• Monolith is not the way
  • Challenging to scale development teams
  • Future maintenance and scalability challenging
• Microservices
  • Simplify work and produce early results
  • Initial maintenance is simple
  • Coordination, communication, operations complexity increase with time
5. Agenda
• Microservices
• Service Mesh
• Kafka/Confluent
• Sagas
• What is a Saga
• Orchestration vs Choreography
• Anatomy of a Record with Sagas
• Solution Overview
• Conclusions
13. What’s a Service Mesh?
“It is a proxy”
– Gwen Shapira, Confluent
Kafka and the Service Mesh | Gwen Shapira, Confluent
https://www.youtube.com/watch?v=Fi292CqOm8A
27. What is a Saga
A saga is a mechanism to implement
transactions that span multiple services
30. What is a Saga - failures happen
If one of the services fails it sends
messages to other services, so they can
execute compensating transactions
31. What is a Saga - failures happen
Compensating transactions
may require rolling back a
database to a previous state
55. Conclusions
• Simplify development, operational burden
• Kafka and Service Mesh
• Keep track and manage E2E scenarios
• Sagas
• Easy maintenance, configuration
• CI/CD
• Fast development cycles
56. Useful resources
• Kafka and the Service Mesh | Gwen Shapira, Confluent
• https://www.youtube.com/watch?v=Fi292CqOm8A
• Using sagas to maintain data consistency in a microservice architecture | Chris Richardson
• https://www.youtube.com/watch?v=YPbGW3Fnmbc
Common scenario: we have data and want to do “stuff” with it
Increasingly common to process, evaluate, run through ML algorithms, etc.
Model Training, Models to production
=> helps if we give a bit of overall context about the business and why it was important to your stakeholders
++ people == more productivity???
New data! New insights!
Monolith is not the way
Microservices: simplified work and produce early results
New microservices; hard to integrate/coordinate with existing stack
Will assume some basic knowledge and understanding of Kafka and microservices in general
What we are going to present
Is there a better way?
How can we accelerate development?
How do we maintain it, thinking about the future?
Bird's-eye view
Splitting the logic through Microservices:
focused, fast development! => Can split into multiple teams!
Easy to maintain
…and to add new services
Easy to scale!
Operationally, elasticity
Hard to manage increasing amount of services
Reusing services forces tight coupling
Same problems addressed repeatedly in different ways
Big jigsaw puzzle to manage and operate => how can we simplify/ease this process
Add source here (and at the end)
Sidecar: encapsulates the logic of the proxy, and is deployed with every single microservice. Everything talks directly with the proxy. Applications are composed of heterogeneous components
https://www.youtube.com/watch?v=Fi292CqOm8A
Still hard to add new services, or even new requirements into existing services
APIs still need to be taken into consideration, synchronous communication
Won’t go into detail – load balancing. Operational complexity is simplified => development time reduced
Note: review high level definition of service mesh <- intercepts traffic. Allows for many levels of control
Synchronous Communication
Decoupling of services
Communication patterns simplified
Message delivery guarantees/simplification
Reactive Architecture
Kafka allows decoupling services => asynchronous => simplifies service
Schema Registry (SR) allows proper communication patterns and independent evolution
APIs between services are contracts => Event Schemas ARE the API…but better!
No need to coordinate service deployment
Easy to scale, easy to deploy
… and adding new services is easy! Insta-reusability!
Simplified development, operations, etc
Big Picture? Where does our application start? What if we want to change the logic? Add an additional validation step in the middle for a specific service?
We still require some local state management on services, making it even harder to manage, recover, maintain, keep logic, add new pipeline paths, etc.
How do we: keep track of a pipeline's logic? Change it according to needs? Add new services as part of the overall logic? Replay older pipelines?
Which topic do we write to after a failure?
Just one? Does it depend on the failure?
Sagas might help us solve this new problem
Now we are approaching the main topic of this presentation. A saga is a way to unify, under one logical transaction, a sequence of actions that spans several services. We did not go with two-phase commit because of several problems, for example deadlocks. The saga is a concept from the '80s, in the database world, to allow complex rollbacks.
A Saga describes the way business logic requires information to flow through services. On the happy path, that flow is straightforward.
In the case of a failure or illegal request, it is necessary to launch compensating transactions for every upstream service. Notice that this leads to eventually consistent systems.
And compensating transactions may be arbitrarily complex, for instance if there is internal state to revert.
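The failure path described above can be sketched in a few lines. This is a minimal, in-process illustration, not the speakers' implementation: in the talk the services exchange messages over Kafka, whereas here plain function calls stand in for them. The step names (`reserve`, `charge`) and the `Step` and `run_saga` helpers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One saga step: a forward action plus the compensation that undoes it."""
    name: str
    action: Callable[[dict], None]
    compensate: Callable[[dict], None]


def run_saga(steps: List[Step], state: dict) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    done: List[Step] = []
    for step in steps:
        try:
            step.action(state)
            done.append(step)
        except Exception:
            # Undo every upstream step that already succeeded, newest first.
            for finished in reversed(done):
                finished.compensate(state)
            return False
    return True


# Hypothetical business steps, used only for illustration.
def reserve(state: dict) -> None:
    state["stock"] -= 1


def release(state: dict) -> None:
    state["stock"] += 1


def charge(state: dict) -> None:
    if state["balance"] < state["price"]:
        raise ValueError("insufficient funds")
    state["balance"] -= state["price"]


def refund(state: dict) -> None:
    state["balance"] += state["price"]


ORDER_SAGA = [Step("reserve", reserve, release), Step("charge", charge, refund)]
```

When `charge` fails, only `reserve` has completed, so only `release` runs. The state passes through an intermediate value (stock already decremented) before it is restored, which is the eventual consistency mentioned in the notes.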
Sagas can be implemented with both Orchestration and Choreography paradigms. As you noticed in this slide, we believe Kafka is a perfect fit for Choreography.
With Orchestration, an Orchestrator service is mandatory. In some projects we came to the conclusion that using an Orchestrator introduces a lot of accidental complexity.
And the complexity increases dramatically with the introduction of compensating transactions.
Imagine several business processes: the Orchestrator grows into a monolith in its own right.
But Choreography in itself is not the solution either. If a naive approach is followed, it could be even worse than Orchestration.
Disseminating the saga logic through every service leads to an exponential amount of state to maintain. It gets unmanageable very fast.
So we opted for a stateless solution for Sagas.
We inject the saga into the Kafka record and update the saga as it flows through the services.
Basically, we need to inject:
- DAG
- Pointer to current step
- Unique identifier of the saga instance
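One way to picture these three pieces is as a list of Kafka-style headers, that is, (string key, bytes value) pairs. The sketch below is an assumption-laden illustration rather than the format used in the talk: the header names `saga-dag`, `saga-pointer` and `saga-id`, the JSON adjacency-list encoding of the DAG, and both helper functions are hypothetical.

```python
import json
import uuid
from typing import Dict, List, Tuple

# Kafka record headers are key/value pairs with string keys and byte values.
Headers = List[Tuple[str, bytes]]


def new_saga_headers(dag: Dict[str, List[str]], start: str) -> Headers:
    """Build the three saga headers for a fresh saga instance.

    `dag` maps each step name to the names of the steps that follow it.
    """
    if start not in dag:
        raise ValueError(f"start step {start!r} is not in the DAG")
    return [
        ("saga-dag", json.dumps(dag).encode("utf-8")),         # the DAG
        ("saga-pointer", start.encode("utf-8")),                # current step
        ("saga-id", str(uuid.uuid4()).encode("utf-8")),         # instance id
    ]


def read_saga_headers(headers: Headers) -> Tuple[Dict[str, List[str]], str, str]:
    """Decode (dag, pointer, saga_id) back out of the headers."""
    by_key = dict(headers)
    dag = json.loads(by_key["saga-dag"].decode("utf-8"))
    pointer = by_key["saga-pointer"].decode("utf-8")
    saga_id = by_key["saga-id"].decode("utf-8")
    return dag, pointer, saga_id
```

Because everything a service needs travels with the record, no service has to store the pipeline definition locally.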
A record is composed of a set of attributes, including a timestamp, and the necessary Key and Value fields.
…it also has a Headers field since 0.11!
Sagas are attached to messages as Headers with a DAG representing the Saga.
Headers => keep metadata
The pointer is moved forward so that each service can locate itself in the DAG and send a message to the appropriate next service
Id => instance of a saga
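The pointer movement can be sketched as a pure function over the headers. This is an illustration under stated assumptions, not the speakers' code: the header names `saga-dag` and `saga-pointer`, the JSON adjacency-list encoding, the `advance` helper, and the convention that each step name doubles as the destination topic name are all hypothetical. No Kafka client is involved; the function only computes what a service would publish next.

```python
import json
from typing import List, Tuple

Headers = List[Tuple[str, bytes]]


def advance(service: str, headers: Headers) -> List[Tuple[str, Headers]]:
    """Return (topic, headers) pairs for the records this service should emit.

    The service checks that the pointer names it, then emits one record per
    successor in the DAG with the pointer moved to that successor.
    An empty result means the saga has reached a final step.
    """
    by_key = dict(headers)
    dag = json.loads(by_key["saga-dag"].decode("utf-8"))
    pointer = by_key["saga-pointer"].decode("utf-8")
    if pointer != service:
        raise ValueError(f"record points at {pointer!r}, not {service!r}")

    outgoing: List[Tuple[str, Headers]] = []
    for next_step in dag.get(service, []):
        # Copy all headers unchanged except the pointer, which moves forward.
        moved = [
            (key, next_step.encode("utf-8") if key == "saga-pointer" else value)
            for key, value in headers
        ]
        outgoing.append((next_step, moved))
    return outgoing
```

Since `advance` depends only on its inputs, the service itself stays stateless: changing the pipeline means changing the DAG injected at the start of the saga, not redeploying the services.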
We presented a tried and tested methodology
Goes without saying: CI/CD
We can bake the cake and eat it too
Split development responsibilities
Fast, simple, safer