In the following slides, our colleagues Dimosthenis Botsaris and Alexandros Koufatzis explore Kafka and Event-Driven Architecture. They explain what the Kafka platform is and how it works, and present an architecture approach using event-driven systems and Kafka. They also look at some of Kafka's core configuration options to review before deploying to production, and discuss best practices for building a reliable data delivery system with Kafka.
Check out the repository: https://github.com/arconsis/Eshop-EDA
2. Event Driven Architectures
BENEFITS
● Async communication between microservices based on events
● Decoupled microservices enable teams to act more independently, increasing velocity
● Using Kafka as the event backbone increases resilience: applications can restart and recover or replay events
● Data store of immutable append-only logs
DRAWBACKS
● Need for handling of duplicate events
● Lack of clear workflow order
● Complex error handling
● Requires a comprehensive set of monitoring tools
4. Deployment Overview
● The project is deployed on AWS EKS using Terraform
● AWS RDS - Postgres is used for the database
● Fully Managed Apache Kafka - MSK
● Infrastructure as Code with Terraform
● A bastion service, written in Go, provides an API for configuring Postgres and Kafka
5. Microservices Overview
● OrderService: Responsible for order creation. Publishes the initial event to the Orders topic
● UserService: Handles user information, user addresses, etc.
● PaymentService: Contacts the payment provider and handles payment failures
● WarehouseService: Validates orders against the available item stock. Handles the shipment of the order
● EmailService: Informs the user about the final status of the order by sending an email via an external email provider
6. Single Writer Principle
● Responsibility for propagating events of a specific type belongs to a single service
● Improves consistency and validation
● The team that manages a service owns the schema and versioning of its events
● E.g. OrderService controls every state change made to an Order
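As a minimal sketch of the principle (all names here are illustrative, not taken from the repository): only OrderService can emit Order events, so every state change funnels through the one owner of the schema.

```kotlin
// Sketch of the single-writer principle: only OrderService exposes a way to
// publish Order events, so schema and versioning have exactly one owner.
// All names are illustrative.
enum class OrderEventType { REQUESTED, VALIDATED, PAID }

data class OrderEvent(val orderId: String, val type: OrderEventType)

class OrderService(private val publish: (OrderEvent) -> Unit) {
    // Other services ask OrderService for state changes; they never
    // publish Order events themselves.
    fun markPaid(orderId: String) = publish(OrderEvent(orderId, OrderEventType.PAID))
}
```

Other services depend on OrderService's API rather than writing to the Orders topic directly.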
9. Event structure
● Generic Message interface with payload and messageId
● messageId should be a message-specific id, not the same as the unique id of a Domain Object
● The Record is the Key-Value pair sent as an Event to Kafka
● Key of the Record should be the unique id of the Domain Object (e.g. orderId)
interface Message<T> {
    val payload: T
    val messageId: UUID
}

data class OrderMessage(
    override val payload: Order,
    override val messageId: UUID
) : Message<Order>

fun Order.toOrderMessageRecord(): Record<String, OrderMessage> =
    Record.of(
        id.toString(),
        OrderMessage(
            payload = this,
            messageId = UUID.randomUUID()
        )
    )
10. Event ordering
● Kafka topics are split into a number of partitions
● Events with the same key are always written to the same partition
● Order events with the same order id are therefore always written to the same partition
● This guarantees that all events for the same order land in one partition, in the order they were produced
enum class OrderStatus {
    REQUESTED,
    VALIDATED,
    OUT_OF_STOCK,
    PAID,
    SHIPPED,
    COMPLETED,
    PAYMENT_FAILED,
    CANCELLED,
    REFUNDED,
    SHIPMENT_FAILED
}
Record.of(
    order.id.toString(),
    OrderMessage(
        payload = order,
        messageId = UUID.randomUUID()
    )
)
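The key-to-partition mapping can be sketched as follows. Kafka's default partitioner actually hashes the key bytes with murmur2; String.hashCode() below is a simplified stand-in used to show the determinism, not the real algorithm.

```kotlin
// Simplified model of how a keyed record is assigned to a partition:
// hash the key, reduce it modulo the partition count. Because the mapping
// is a pure function of the key, all events for one orderId land on the
// same partition and keep their relative order.
fun partitionFor(key: String, numPartitions: Int): Int =
    (key.hashCode() and Int.MAX_VALUE) % numPartitions
```

One caveat: increasing the partition count changes this mapping, which is why resizing a topic breaks per-key ordering for data produced before the resize.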
11. Avoiding duplicate Events
● Ideally Kafka would deliver every event exactly once, but failures can lead to a consumer receiving the same event multiple times
● An offset (increasing integer value) is used to track which messages are already consumed
● Each message in a specific partition has a unique offset
● enable.auto.commit has to be set to false
● Idempotent event handlers, which behave like an HTTP PUT operation, can be used
● Unfortunately not every operation is idempotent
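An idempotent handler can be sketched in memory like this (names are illustrative): re-applying the same status update overwrites rather than accumulates, like an HTTP PUT.

```kotlin
// Idempotent event handler sketch: setting an order's status is safe to
// repeat, unlike a non-idempotent operation such as "decrement stock by 1".
// All names are illustrative.
data class OrderState(val orderId: String, val status: String)

class OrderStore {
    private val orders = mutableMapOf<String, OrderState>()

    // Upsert semantics: applying the same event twice yields the same state.
    fun apply(orderId: String, status: String) {
        orders[orderId] = OrderState(orderId, status)
    }

    fun statusOf(orderId: String): String? = orders[orderId]?.status
}
```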
12. Avoiding duplicate Events 2
● Tracking received events with the support of an ACID database can help us guarantee exactly-once processing
● When receiving an event, we create a ProcessedEvent object with eventId set to the messageId included in the received event
● We then wrap any DB update in a transaction together with a record insert into the processed_events table
● If an event has already been processed, the DB insert fails and the transaction is rolled back
● We commit the offset back to Kafka as successfully processed even for a duplicate, so that we don't receive the event again
data class ProcessedEvent(
    val eventId: UUID,
    val processedAt: Instant
)

@Entity
@Table(name = "processed_events")
class ProcessedEventEntity(
    @Id
    @Column(nullable = false, name = "event_id")
    var eventId: UUID,

    @Column(nullable = false, name = "processed_at")
    var processedAt: Instant,

    @CreationTimestamp
    @Column(name = "created_at")
    var createdAt: Instant? = null,

    @UpdateTimestamp
    @Column(name = "updated_at")
    var updatedAt: Instant? = null
)
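The steps above can be condensed into an in-memory sketch (a mutable set stands in for the processed_events table and its unique key; in the real pattern the insert and the business update share one DB transaction):

```kotlin
import java.util.UUID

// Dedup sketch: the add() into processedEvents plays the role of the
// unique-key insert into processed_events. A duplicate skips the handler
// but still lets the caller commit the Kafka offset.
class EventDeduplicator {
    private val processedEvents = mutableSetOf<UUID>()

    // Returns true if the handler ran, false if the event was a duplicate.
    fun processOnce(eventId: UUID, handler: () -> Unit): Boolean {
        if (!processedEvents.add(eventId)) return false // duplicate: skip
        try {
            handler() // business DB update; same transaction in the real pattern
        } catch (e: Exception) {
            processedEvents.remove(eventId) // "rollback" of the insert
            throw e
        }
        return true
    }
}
```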
13. Kafka specific settings
● kafka.acks = all: the number of acknowledgments the producer requires the leader to have received before considering a request complete
○ Events written to the partition leader are not immediately readable by consumers
○ Only when all in-sync replicas have acknowledged the write is the message considered committed
○ This ensures that messages cannot be lost by a single broker failure
● kafka.enable.idempotence = true
○ The producer ensures that exactly one copy of each message is written to the stream
● kafka.enable.auto.commit = false
○ Controls whether the consumer commits offsets automatically. Auto commit gives us "at least once" delivery: Kafka guarantees that no messages are missed, but duplicates are possible.
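The three settings map to plain client properties roughly as follows (the kafka.* prefix in the slides is framework configuration; the string keys below are the Apache Kafka client names, and the broker address and group id are placeholders):

```kotlin
import java.util.Properties

// Reliability-oriented producer and consumer properties, assembled as
// plain java.util.Properties for the Apache Kafka Java clients.
fun reliableProducerProps(bootstrap: String): Properties = Properties().apply {
    put("bootstrap.servers", bootstrap)
    put("acks", "all")                // wait for all in-sync replicas
    put("enable.idempotence", "true") // broker de-duplicates producer retries
}

fun reliableConsumerProps(bootstrap: String, group: String): Properties = Properties().apply {
    put("bootstrap.servers", bootstrap)
    put("group.id", group)
    put("enable.auto.commit", "false") // commit offsets manually after processing
}
```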
15. Drawbacks and alternatives
DRAWBACKS
● No atomicity between Kafka and the database
● Custom distributed transactions are complex to write and manage
● Some failures leave the system in an undetermined state
○ If the event is sent in the middle of the DB transaction, there is no guarantee the transaction will commit
○ If the event is sent after the DB transaction, there is no guarantee the process won't crash before sending it
○ E.g. a new Order is created in the DB but sending the OrderRequested event to Kafka fails: undetermined state
ALTERNATIVES
● CDC (Debezium) and Outbox Pattern
● Kafka Streams
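The Outbox alternative can be sketched in memory like this (lists stand in for DB tables; a relay such as Debezium would read the outbox and publish to Kafka; all names are illustrative):

```kotlin
// Outbox pattern sketch: the order row and the outbox row are written inside
// the same "transaction", so an order can never exist without its event
// being recorded for later publication.
data class OutboxRecord(val aggregateId: String, val type: String, val payload: String)

class OrderRepository {
    val orders = mutableListOf<String>()
    val outbox = mutableListOf<OutboxRecord>()

    // Both writes succeed or neither does; a synchronized block stands in
    // for the DB transaction boundary of the real pattern.
    @Synchronized
    fun createOrder(orderId: String) {
        orders.add(orderId)
        outbox.add(OutboxRecord(orderId, "OrderRequested", """{"orderId":"$orderId"}"""))
    }
}
```

Because the relay publishes from the database, the atomicity problem above disappears: Kafka delivery becomes asynchronous but guaranteed to eventually happen.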
16. Let’s see some code snippets!
https://github.com/arconsis/Eshop-EDA
17. References
● Kafka: The Definitive Guide by Neha Narkhede, Gwen Shapira, Todd Palino: https://www.amazon.com/Kafka-Definitive-Real-Time-Stream-Processing/dp/1491936169
● Designing Event-Driven Systems by Ben Stopford: https://www.confluent.io/designing-event-driven-systems/
The difference between Key and MessageId will be handled later when we talk about duplication