RabbitMQ and Apache Kafka are two popular messaging systems. RabbitMQ uses a push model: consumers register interest in queues and the broker pushes messages to them. This gives low latency but requires back-pressure in the form of consumer prefetch. Kafka uses a pull model: consumers pull messages from topic partitions in batches. Because each partition is read by only one consumer of a consumer group, batching improves throughput without affecting processing order. Both systems provide reliability through mechanisms such as persistent messages, clustering, and mirrors/replicas, and both let you trade data safety against latency and throughput.
3. Background
• Jack Vanlightly
• Cloud Architect and Data Engineer at SII Concatel, Barcelona
• Event-Driven Architectures
• Messaging Systems
• Cloud Automation
• Data Pipelines
4. RabbitMQ – Push Model
[Diagram: Producer -> (publish) -> Exchange -> (route) -> Queue -> (push) -> Consumers]
Consumer Push
- Long-lived TCP connection
- Consumer registers interest in queues
- Broker pushes messages down the connection in real time
Producer Publish
- Send messages one at a time
5. Kafka – Pull Model
[Diagram: Producer -> (publish in batches) -> Topic A (partitions 1, 2 and 3) -> (pull in batches) -> Consumers]
Consumer Pull
- Long-lived TCP connection
- Consumer registers interest in a topic as part of a consumer group
- Consumer makes requests for messages in batches
Producer Publish
- Send messages in batches
6. Pull (Apache Kafka) vs Push (RabbitMQ)
RabbitMQ – Why Push?
The push model allows RabbitMQ to:
• Offer low latency messaging.
• Evenly distribute messages across competing consumers.
• Keep processing order closer to delivery order in the face of competing consumers.
A push model requires back-pressure: consumer prefetch.
Kafka – Why Pull?
Because each partition cannot be read by more than one consumer of a consumer group, the consumer can pull batches of messages without:
• affecting processing order
• affecting message distribution amongst consumers
Batching up messages improves compression and throughput.
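The prefetch back-pressure mentioned above can be sketched as a toy model. This is not RabbitMQ code: the class `PushQueue` and its methods are hypothetical names, and the only behaviour modelled is that the broker stops pushing once the number of unacknowledged messages reaches the prefetch limit.

```python
from collections import deque


class PushQueue:
    """Toy broker queue that pushes to one consumer, bounded by a prefetch limit."""

    def __init__(self, messages, prefetch):
        self.ready = deque(messages)  # messages waiting in the queue
        self.unacked = set()          # pushed but not yet acknowledged
        self.prefetch = prefetch      # max unacknowledged messages in flight

    def push(self):
        """Push messages until the prefetch window is full; return what was pushed."""
        pushed = []
        while self.ready and len(self.unacked) < self.prefetch:
            msg = self.ready.popleft()
            self.unacked.add(msg)
            pushed.append(msg)
        return pushed

    def ack(self, msg):
        """Consumer acknowledges a message, freeing one slot in the window."""
        self.unacked.remove(msg)


queue = PushQueue(messages=range(1, 11), prefetch=3)
first = queue.push()      # the window fills with 3 messages
stalled = queue.push()    # nothing more is pushed until an ack arrives
queue.ack(first[0])
after_ack = queue.push()  # one slot was freed, so one more message is pushed
print(first, stalled, after_ack)
```

A small prefetch keeps a slow consumer from being flooded, at the cost of throughput; a large prefetch does the opposite.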
7. Message Acknowledgement Protocols
At-most-once.
This means that a message will never be delivered more than once, but messages might be lost.
At-least-once.
This means that we'll never lose a message, but a message might end up being delivered to a consumer more than once.
Exactly-once.
The holy grail of messaging. All messages will be delivered exactly one time.
Delivery vs Processing
Delivered twice to be processed once.
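"Delivered twice to be processed once" can be illustrated with a consumer that deduplicates on a message ID. This is a minimal sketch under assumptions: it presumes every message carries a unique ID (the `message_id` field and the `DedupingConsumer` class are hypothetical), and it keeps seen IDs in memory, whereas a real consumer would need a durable store.

```python
class DedupingConsumer:
    """At-least-once delivery turned into process-once by remembering message IDs."""

    def __init__(self):
        self.seen_ids = set()  # assumption: in-memory only, lost on restart
        self.processed = []

    def handle(self, message_id, payload):
        """Process a delivery once; return False when it is a duplicate delivery."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.processed.append(payload)
        return True


consumer = DedupingConsumer()
deliveries = [("m1", "order created"), ("m2", "order paid"), ("m1", "order created")]
results = [consumer.handle(mid, body) for mid, body in deliveries]
print(results, consumer.processed)
```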
12. RabbitMQ – Producer Side Duplication
Low # of Messages in Flight = Low Throughput, Low Message Duplication on Failure
Large # of Messages in Flight = High Throughput, High Message Duplication on Failure
Publisher -> Exchange: 1000 messages in flight when connection fails
- 25% of the messages persisted to a queue
- Resend 1000 messages
- Queue ends up with 250 duplicates
Publisher -> Exchange: 1 message in flight when connection fails
- Message was persisted to a queue but connection died before ack could be sent
- Resend 1 message (Resent custom header)
- Queue ends up with 1 duplicate
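The arithmetic behind the two scenarios can be reproduced with a small simulation. It is a sketch, not client code: `publish_with_failure` is a hypothetical helper that assumes the publisher resends every unconfirmed message after the connection fails.

```python
def publish_with_failure(in_flight, persisted_before_failure):
    """Return (queue_length, duplicates) after resending every unconfirmed message."""
    # Messages the broker persisted before the connection died (acks never arrived).
    queue = list(range(persisted_before_failure))
    # The publisher saw no confirms, so it resends all in-flight messages.
    queue.extend(range(in_flight))
    duplicates = len(queue) - len(set(queue))
    return len(queue), duplicates


large = publish_with_failure(in_flight=1000, persisted_before_failure=250)
small = publish_with_failure(in_flight=1, persisted_before_failure=1)
print(large, small)
```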
14. RabbitMQ – Consumer Side Acknowledgements
Consumer Acknowledgements
- basic.ack (all ok, remove from the queue!)
- basic.nack, requeue=false (error, but remove anyway)
- basic.nack, requeue=true (error, please redeliver)
- basic.reject (same as basic.nack but without multiple flag support)
Acknowledgement Mode
- Auto Ack (Push me messages as fast as you can!)
- Manual Ack (I will explicitly tell you when a message can be removed from the queue)
Queue -> Consumer: pushes 10 messages
Delivery tag: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Consumer -> Queue:
basic.ack: 1 multiple=false, basic.ack: 2 multiple=false
basic.ack: 3 multiple=false, basic.ack: 4 multiple=false
basic.ack: 5 multiple=false, basic.ack: 6 multiple=false
basic.ack: 7 multiple=false, basic.ack: 8 multiple=false
basic.ack: 9 multiple=false, basic.ack: 10 multiple=false
15. RabbitMQ – Consumer Side Acknowledgements
Flags
- Multiple (I am acknowledging multiple messages)
- Redelivered (This message is a redelivery)
Multiple Flag
Queue -> Consumer: pushes 10 messages
Delivery tag: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Consumer -> Queue: basic.ack: 6 multiple=true, basic.ack: 10 multiple=true
Redelivered Flag
1. Queue pushes 1 message
2. Consumer sends basic.nack: 1 multiple=false requeue=true
3. Queue delivers the message again with redelivered=true flag
4. Consumer sends basic.ack: 1 multiple=false
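The effect of the multiple flag on the set of outstanding delivery tags can be sketched as follows. `apply_ack` is a hypothetical helper that models only the bookkeeping, not a client-library call.

```python
def apply_ack(unacked, delivery_tag, multiple):
    """Return the delivery tags still unacknowledged after one basic.ack."""
    if multiple:
        # multiple=true acknowledges every outstanding tag up to and including this one
        return {tag for tag in unacked if tag > delivery_tag}
    return unacked - {delivery_tag}


unacked = set(range(1, 11))                               # delivery tags 1..10
after_first = apply_ack(unacked, 6, multiple=True)        # acks tags 1 to 6
after_second = apply_ack(after_first, 10, multiple=True)  # acks tags 7 to 10
print(sorted(after_first), sorted(after_second))
```

Two acknowledgement frames replace ten, which is why grouping acks helps throughput; the trade-off is that more messages are unacknowledged, and so redelivered, if the connection fails.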
16. RabbitMQ – Consumer Side Duplication
Low # of Messages in Flight = Low Throughput, Low Message Duplication on Failure
Large # of Messages in Flight = High Throughput, High Message Duplication on Failure
Queue -> Consumer: 1000 messages in flight when connection fails
- 25% of the messages were processed, but the connection failed before the acks could be sent
- Redeliver 1000 messages
- 250 messages get processed twice
Queue -> Consumer: 1 message in flight when connection fails
- Message was processed but connection died before ack could be sent
- Redeliver 1 message
- 1 message gets processed twice
20. RabbitMQ – The Broker – Queue Mirrors
Queue A: ha-mode = all
Queue B: ha-mode = exactly, ha-params = 3
Queue C: ha-mode = exactly, ha-params = 2
Queue D: unmirrored
[Diagram: Brokers 1, 2 and 3. Queue A has a master and two mirrors, Queue B a master and two mirrors, Queue C a master and one mirror, and Queue D is a single unmirrored queue.]
21. RabbitMQ – The Broker – Queue Mirrors
Queue A: ha-mode = all
Queue B: ha-mode = exactly, ha-params = 3
Queue C: ha-mode = exactly, ha-params = 2
Queue D: unmirrored
[Diagram: Queues A, B and D as on the previous slide. Queue C shows its original master, a mirror promoted to master, and a new mirror.]
22. RabbitMQ – The Broker – Queue Mirrors
Queue A: ha-mode = all
Queue B: ha-mode = exactly, ha-params = 3
Queue C: ha-mode = exactly, ha-params = 2
Queue D: unmirrored
[Diagram: Queue B now shows a mirror promoted to master alongside its original master and one mirror. Queue C shows two master labels and one mirror. Queues A and D are unchanged.]
23. RabbitMQ – The Broker – Queue Mirrors
Queue A: ha-mode = all
Queue B: ha-mode = exactly, ha-params = 3
Queue C: ha-mode = exactly, ha-params = 2
[Diagram: Queue D is no longer shown. Broker 1 hosts mirrors of Queues A, B and C. Queue A has a master and two mirrors, Queue B a master and two mirrors, and Queue C two master labels and one mirror.]
24. RabbitMQ – The Broker – Queue Mirrors
Queue A: ha-mode = all
Queue B: ha-mode = exactly, ha-params = 3
Queue C: ha-mode = exactly, ha-params = 2
Broker 1: Queue A Mirror, Queue B Mirror, Queue C Mirror
Broker 2: Queue A Master, Queue B Master, Queue C Master
Broker 3: Queue A Mirror, Queue B Mirror
26. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Three nodes, two mirrored queues each with 10 messages.
[Diagram: Brokers 1, 2 and 3. Each queue has a master and two mirrors.]
Messages per replica: Queue A 10 / 10 / 10, Queue B 10 / 10 / 10
27. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Broker 3 is lost.
Messages per replica: Queue A 10 / 10 / 10, Queue B 10 / 10 / 10
28. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Broker 3 comes back.
Mirror A is automatically synchronized. Mirror B remains at 0 messages.
Messages per replica: Queue A 10 / 10 / 10, Queue B 10 / 10 / 0 (mirror on broker 3)
29. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Each queue receives 10 more messages.
Broker 2 is lost. Queue A fails over to mirror 3 without data loss.
Messages per replica: Queue A 20 / 20 / 20, Queue B 20 / 20 / 10 (mirror on broker 3)
30. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual, ha-promote-on-failure = always
Each queue receives 10 more messages.
Broker 1 is lost. Queue B fails over to mirror 3 and loses 10 messages.
Messages per replica: Queue A 30 / 30 / 30, Queue B 30 / 30 / 20 (mirror on broker 3)
31. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual, ha-promote-on-failure = when-synced
Alternate scenario: ha-promote-on-failure = when-synced
Queue B does not fail over as mirror 3 is unsynchronized.
Messages per replica: Queue A 30 / 30 / 30, Queue B 30 / 30 / 20 (mirror on broker 3)
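The Queue B walkthrough above can be replayed with a toy model that only tracks message counts per replica. Everything here is an illustrative assumption rather than RabbitMQ behaviour in detail: `MirroredQueue` and `run_queue_b` are hypothetical names, a returning mirror is assumed to start empty, and failover simply picks the fullest eligible mirror.

```python
class MirroredQueue:
    """Toy model of one mirrored queue: message count per broker replica."""

    def __init__(self, brokers, master, sync_mode):
        self.counts = {broker: 0 for broker in brokers}
        self.master = master
        self.sync_mode = sync_mode  # "automatic" or "manual"

    def publish(self, n):
        # Every live replica receives the new messages.
        for broker in self.counts:
            self.counts[broker] += n

    def lose(self, broker):
        del self.counts[broker]

    def rejoin(self, broker):
        # A returning mirror starts empty; automatic sync copies the backlog.
        synced = self.sync_mode == "automatic"
        self.counts[broker] = self.counts[self.master] if synced else 0

    def fail_over(self, promote_on_failure):
        """Lose the master; return messages lost, or None if no mirror is promoted."""
        expected = self.counts.pop(self.master)
        candidates = self.counts
        if promote_on_failure == "when-synced":
            candidates = {b: c for b, c in self.counts.items() if c == expected}
        if not candidates:
            return None  # queue stays unavailable rather than losing data
        self.master = max(candidates, key=candidates.get)
        return expected - candidates[self.master]


def run_queue_b(promote_on_failure):
    q = MirroredQueue(brokers=[1, 2, 3], master=1, sync_mode="manual")
    q.publish(10)
    q.lose(3)
    q.rejoin(3)    # manual sync: the mirror on broker 3 restarts at 0
    q.publish(10)  # counts are now 20, 20, 10
    q.lose(2)
    q.publish(10)  # counts are now 30 (broker 1) and 20 (broker 3)
    return q.fail_over(promote_on_failure)


lost_always = run_queue_b("always")
lost_when_synced = run_queue_b("when-synced")
print(lost_always, lost_when_synced)
```

With `always` the queue stays available but drops the 10 messages the unsynchronized mirror never received; with `when-synced` nothing is promoted, so availability is traded for safety.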
33. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Three nodes, two mirrored queues each with 100 million messages.
Messages per replica: Queue A 100m / 100m / 100m, Queue B 100m / 100m / 100m
34. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Broker 3 is lost.
Messages per replica: Queue A 100m / 100m / 100m, Queue B 100m / 100m / 100m
35. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Broker 3 comes back.
Queue A is unavailable due to synchronization.
Queue B is available but mirror on broker 3 remains at 0 messages.
Messages per replica: Queue A 100m / 100m / * (synchronizing), Queue B 100m / 100m / 0
36. RabbitMQ – Queue Mirrors - Synchronization
Queue A: ha-mode = all, ha-sync-mode = automatic
Queue B: ha-mode = exactly, ha-params = 3, ha-sync-mode = manual
Queue A synchronization completes and the queue becomes available again.
Messages per replica: Queue A 100m / 100m / 100m, Queue B 100m / 100m / 0
37. RabbitMQ
Optimizing for High Throughput
- Producers fire and forget
- Consumers use auto-ack mode
- Non-Persistent messages
- Non-Mirrored Queues
- Cluster for throughput
Balancing Data Safety with High Throughput
- Producers wait periodically for acknowledgements
- Consumers group acknowledgements with the Multiple flag
- Persistent messages
- Cluster
- Queue mirroring with 1 mirror*
Optimizing for Data Safety
- Producers wait for acknowledgements after each message or after a small number of messages
- Consumers acknowledge each message individually
- Persistent messages
- Cluster for durability
- Queue mirroring with 2+ mirrors
- ha-promote-on-failure=when-synced
- ha-sync-mode=automatic for active queues
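As an illustration, the mirroring settings of the data-safety profile could be written as a policy definition like the one below. This is a sketch under assumptions: the variable names are hypothetical, and `ha-params = 3` assumes the count includes the master (master plus 2 mirrors), which matches the Queue B and Queue C diagrams earlier in the deck.

```python
import json

# Hypothetical policy definition for the "Optimizing for Data Safety" column.
data_safety_policy = {
    "ha-mode": "exactly",
    "ha-params": 3,  # assumption: master + 2 mirrors
    "ha-sync-mode": "automatic",
    "ha-promote-on-failure": "when-synced",
}

# Serialized form, e.g. for passing as a policy definition to the broker tooling.
policy_json = json.dumps(data_safety_policy, sort_keys=True)
print(policy_json)
```

The remaining items in the column (publisher acknowledgements, individual consumer acks, persistent messages) are set by the clients, not by a queue policy.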
40. Apache Kafka – Replicated Partitions
[Diagram: a Producer sends to the Partition 0 leader on Broker 2. The other two brokers each host a Partition 0 follower.]
Each partition has a leader with 0 or more followers.
Producers send to leaders. Consumers consume from leaders.
Followers exist for redundancy.
41. Apache Kafka – Producer Side Acknowledgements
Producer Config
acks
- No acknowledgement (fire and forget). acks=0
- Leader has persisted the message. acks=1
- Leader and all In-Sync Replicas have persisted the message. acks=all
Other settings
- retries
- enable.idempotence (limits throughput)
- max.in.flight.requests.per.connection
- batch.size
- max.request.size
Broker/Topic Config
Other settings
- default.replication.factor
- min.insync.replicas
- unclean.leader.election.enable
42. Retries + Multiple Requests In Flight = Message Duplication + Out of Order Messages
[Diagram: a Producer sends Batches 1, 2 and 3 to partitions P0, P1 and P2, which are read by consumers C1, C2 and C3 of one consumer group.]
43. Retries + Multiple Requests In Flight = Message Duplication + Out of Order Messages
[Diagram: same layout, Batches 1, 2 and 3 on each of P0, P1 and P2.]
44. Retries + Multiple Requests In Flight = Message Duplication + Out of Order Messages
[Diagram: same layout with Batches 4 and 5 added. Some batches from the first three appear more than once.]
45. Retries + Multiple Requests In Flight = Message Duplication + Out of Order Messages
[Diagram: same layout, Batches 1 to 5 across P0, P1 and P2, with one batch still appearing more than once.]
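How retries combine with multiple in-flight requests to produce duplicates and reordering can be shown with a toy partition log. The function `send_batches` and its parameters `write_fails` and `ack_lost` are hypothetical: the first marks batches whose first attempt never reaches the log, the second marks batches that are written but whose acknowledgement is lost.

```python
def send_batches(batches, max_in_flight, write_fails=frozenset(), ack_lost=frozenset()):
    """Append batches to a toy partition log, retrying any that were not acknowledged."""
    log = []
    pending = list(batches)
    write_fails, ack_lost = set(write_fails), set(ack_lost)
    while pending:
        # Send up to max_in_flight batches without waiting for acknowledgements.
        window, pending = pending[:max_in_flight], pending[max_in_flight:]
        retries = []
        for batch in window:
            if batch in write_fails:
                write_fails.discard(batch)  # request never reached the log
                retries.append(batch)
                continue
            log.append(batch)
            if batch in ack_lost:
                ack_lost.discard(batch)     # written, but the producer never saw the ack
                retries.append(batch)
        pending = retries + pending         # retry before sending anything new
    return log


out_of_order = send_batches([1, 2, 3], max_in_flight=3, write_fails={1})
duplicated = send_batches([1, 2, 3], max_in_flight=3, ack_lost={1})
in_order = send_batches([1, 2, 3], max_in_flight=1, write_fails={1})
print(out_of_order, duplicated, in_order)
```

Limiting the producer to one request in flight restores ordering in this model, but a lost acknowledgement still produces a duplicate.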
46. Avoid Data Loss and Producer-Side Duplication With enable.idempotence = true
• enable.idempotence set to true
• max.in.flight.requests.per.connection set to 5 or less
• retries set to 1 or higher
• acks set to 'all'
Producers can retry until they succeed while avoiding message duplication.
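The four requirements listed above can be checked against a producer configuration with a small validator. This is a sketch: `check_idempotent_producer` is a hypothetical helper, the property names are the ones on the slide, and for simplicity only the literal value `all` is accepted for `acks`.

```python
def check_idempotent_producer(config):
    """Return a list of problems with the four settings required for idempotence."""
    problems = []
    if config.get("enable.idempotence") is not True:
        problems.append("enable.idempotence must be true")
    in_flight = config.get("max.in.flight.requests.per.connection")
    if in_flight is None or not 1 <= in_flight <= 5:
        problems.append("max.in.flight.requests.per.connection must be between 1 and 5")
    retries = config.get("retries")
    if retries is None or retries < 1:
        problems.append("retries must be 1 or higher")
    if str(config.get("acks")).lower() != "all":
        problems.append("acks must be 'all'")
    return problems


safe = {
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 5,
    "retries": 3,
    "acks": "all",
}
unsafe = dict(safe, acks="1", retries=0)
print(check_idempotent_producer(safe))
print(check_idempotent_producer(unsafe))
```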
53. Apache Kafka – The Broker – The ISR
In-Sync Replica Set (ISR)
- The leader + the followers who are up to date with the leader
- A follower is removed from the ISR when either:
  - It has not sent any fetch requests to the leader within the replica.lag.time.max.ms time period
  - It has not been up to date with the leader for at least the replica.lag.time.max.ms period
- Followers send fetch requests to the leader at an interval of replica.fetch.wait.max.ms, which should be lower than replica.lag.time.max.ms
54. Apache Kafka – The Broker – The ISR
Topic with:
- 2 partitions
- Replication factor = 3
- replica.lag.time.max.ms = 10000
- replica.fetch.wait.max.ms = 500
[Diagram: Brokers 1, 2 and 3. Partition 0 has its leader on Broker 2 and a follower on each other broker. Partition 1 has a leader and two followers.]
Follower lag: Partition 0 followers are 1 message (0:01) and 8 messages (0:08) behind. Partition 1 followers are 0 messages (0:00) and 9 messages (0:09) behind.
One message a second arriving at each partition.
Partitions in broker 3 lagging, but still within the 10 second limit.
55. Apache Kafka – The Broker – The ISR
Topic with:
- 2 partitions
- Replication factor = 3
- replica.lag.time.max.ms = 10000
- replica.fetch.wait.max.ms = 500
Follower lag: Partition 0 followers are 0 messages (0:00) and 22 messages (0:22) behind. Partition 1 followers are 1 message (0:01) and 19 messages (0:19) behind.
Broker 3 seems to have an issue. Its partitions have been out-of-sync for more than 10 seconds and are no longer in the ISR.
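The ISR rule from the two slides above can be expressed as a simple check on how long each follower has been behind the leader. This is a simplified sketch, not broker code: `in_sync_replicas` is a hypothetical function, the broker names are illustrative, and lag is reduced to a single time value per follower.

```python
def in_sync_replicas(leader, follower_lag_ms, replica_lag_time_max_ms):
    """Return the ISR: the leader plus followers whose lag is within the limit."""
    isr = {leader}
    for follower, lag_ms in follower_lag_ms.items():
        if lag_ms <= replica_lag_time_max_ms:
            isr.add(follower)
    return isr


# Partition 0: one follower is 8 seconds behind, still inside the 10 second limit.
within_limit = in_sync_replicas(
    "broker-2", {"broker-1": 1_000, "broker-3": 8_000}, replica_lag_time_max_ms=10_000
)
# Partition 0 later: the same follower is 22 seconds behind and drops out of the ISR.
over_limit = in_sync_replicas(
    "broker-2", {"broker-1": 0, "broker-3": 22_000}, replica_lag_time_max_ms=10_000
)
print(sorted(within_limit), sorted(over_limit))
```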
56. Apache Kafka – Low Latency, High Availability
Optimizing for Low Latency and High Availability
Acks = 1
unclean.leader.election.enable = true
57. Apache Kafka – Low Latency, High Availability
Topic with:
- Replication factor = 3
- replica.lag.time.max.ms = 10000
- replica.fetch.wait.max.ms = 500
- unclean.leader.election.enable = true
[Diagram: the Producer sends 1 message with acks = 1 to the Partition 0 leader on Broker 2 and receives an ack. The two followers are 1 message (0:01) and 8 messages (0:08) behind.]
Producer sends a message and the leader persists the message, then sends an ack.
Added latency due to replicas: +0 ms
63. Apache Kafka – Data Safety, Higher Latency
Optimizing for Data Safety
(Increased Latency, Lower Availability)
acks = all
replication.factor = 3
min.insync.replicas = 2
a quorum (n+1)/2
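The numbers on this slide can be checked with a few lines of arithmetic. Both helpers are hypothetical names; `majority_quorum` uses integer division, which agrees with (n+1)/2 whenever n is odd.

```python
def majority_quorum(replicas):
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def tolerated_broker_losses(replication_factor, min_insync_replicas):
    """Replicas that can be lost while acks=all writes are still accepted."""
    return replication_factor - min_insync_replicas


quorum = majority_quorum(3)             # same value as (3 + 1) / 2
losses = tolerated_broker_losses(3, 2)
print(quorum, losses)
```

With a replication factor of 3 and min.insync.replicas = 2, one broker can fail and writes with acks = all still succeed; losing a second broker makes the partition unavailable for such writes rather than risking data loss.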
71. Apache Kafka – Consumer Side Duplication
Low # of Messages in Flight = Low Throughput, Low Message Duplication on Failure
Large # of Messages in Flight = High Throughput, High Message Duplication on Failure
Partition -> Consumer: 1000 messages in flight when consumer fails
- 25% of the messages are processed, but the application fails before the offset is committed
- Replacement consumer fetches 1000 messages from the same offset
- 250 messages get processed a second time by the replacement application
Partition -> Consumer: 1 message in flight when consumer fails
- Message is processed, but the application fails before the offset is committed
- Replacement consumer fetches 1 message from the same offset
- 1 message gets processed a second time by the replacement consumer
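The offset-commit behaviour behind these numbers can be replayed on a plain list standing in for a partition. `consume_with_crash` is a hypothetical helper: it assumes the first consumer crashes after processing part of a batch without committing, and a replacement resumes from the last committed offset.

```python
def consume_with_crash(log, batch_size, crash_after):
    """Process part of a batch, crash before committing, then let a replacement re-fetch."""
    committed_offset = 0
    processed = []
    # First consumer fetches a batch and dies after processing part of it.
    batch = log[committed_offset:committed_offset + batch_size]
    processed.extend(batch[:crash_after])
    # The replacement consumer starts again from the last committed offset.
    batch = log[committed_offset:committed_offset + batch_size]
    processed.extend(batch)
    committed_offset += len(batch)
    duplicates = len(processed) - len(set(processed))
    return committed_offset, duplicates


partition = list(range(1000))
large = consume_with_crash(partition, batch_size=1000, crash_after=250)
small = consume_with_crash(partition, batch_size=1, crash_after=1)
print(large, small)
```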
73. RabbitMQ vs Kafka
Both
- Message Acknowledgements
- Tunable Consistency vs Availability vs Latency vs Throughput
- Configurable Redundancy
RabbitMQ
- Fire-and-forget
- Publisher Confirms
- Consumer Side Redelivered flag
- Synchronous Replication
- Replica Synchronization Can Cause Unavailability
Kafka
- Fire-and-forget
- Leader Only
- All ISR
- Producer Side Idempotency
- Synchronous/Asynchronous Replication
- Availability During Replica Synchronization
74. Thank you!
Questions?
Jack Vanlightly
12 NOVEMBER 2018
London, UK