Y790 – Independent Study
Streaming Performance with Apache Kafka and RabbitMQ
Shameera Rathnayaka Yodage (syodage@indiana.edu)
Introduction
The demand for stream processing is increasing rapidly. It is no longer enough to process data
in large volumes; data must also be processed quickly so that users can understand its nature in real
time. This is required for fraud detection, trading, social network event processing, and many other
applications. A source can be anything that publishes changes to data at a high rate. There can be more than
one data source, in which case the application needs to handle all of these event streams in real time.
The backend event processing component needs to process this data at the same speed at which it arrives,
but in reality the backend often cannot keep up: processing an event may take time, and in the meantime
new data keeps arriving. To handle such high-rate event streams, the stream processing component
needs to process events in parallel. Apache Kafka and RabbitMQ are two popular message broker
implementations that can be used to manage real-time event streaming and processing in a reliable way.
These brokers provide guaranteed delivery, meaning every event will be delivered to the backend when
the backend is ready to process it. In this study, we measured the round trip latency of Apache Kafka and
RabbitMQ with different message sizes and compared the results.
Data Streaming
Data streaming is the continuous generation of data by a large number of data sources, which
typically send small records simultaneously. Such data sources are widely used in industry.
Twitter is one major data streaming application, generating large numbers of tweets from millions
of users. Other examples include log files generated by customers using web applications,
e-commerce purchases, information from social networks, financial trading floors, and connected
IoT devices. This data needs to be processed sequentially, either record by record or over small
sliding time windows. To act in near real time on the behavior of such a data stream, stream
processing engines need to process the data as soon as it is received. Stream processing is used in a
wide variety of analytics, including correlations, aggregations, filtering, and sampling. Information
derived from such analysis gives companies more visibility into their business and helps them make
important decisions without delay. Every message in these data streams is valuable
and needs to be processed without loss. To achieve this, we need a reliable message broker
between the streaming data sources and the stream processing engine.
Apache Kafka
Apache Kafka [1] is a distributed, partitioned, replicated commit log service; in other words,
Kafka is a high-throughput distributed messaging system. Apache Kafka is an open source project
developed under the Apache Software Foundation. Kafka is designed to allow a single cluster to serve
as the central data backbone for a large organization. It can be elastically and transparently expanded
without downtime. Messages are persisted on disk and replicated within the cluster to prevent data
loss. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance
guarantees.
A Kafka topic is a feed name to which messages are published. Kafka maintains multiple partitions
per topic, and each partition can have multiple replicas; this is how Kafka provides high fault
tolerance for its data. The recommendation is to set the partition count equal to the number of
instances in the cluster and the replication factor to at least 2 for a fault-tolerant service.
Figure 1: Kafka Topic partitions
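For instance, a topic matching this recommendation could be created with the kafka-topics.sh tool shipped with the Kafka 0.8.x release used here; the topic name and the ZooKeeper connect string zk1:2181 below are placeholders for the actual deployment:

```shell
# Create a topic with 3 partitions and replication factor 2
# (zk1:2181 is a placeholder for the ZooKeeper connect string).
bin/kafka-topics.sh --create \
  --zookeeper zk1:2181 \
  --topic latency-test \
  --partitions 3 \
  --replication-factor 2

# Inspect the resulting partition leaders and replica assignments.
bin/kafka-topics.sh --describe --zookeeper zk1:2181 --topic latency-test
```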
Apache Kafka uses Apache ZooKeeper [2] to store configuration and as a distributed coordinator
for its cluster. Kafka stores all topic, partition, replication, consumer, and producer related
configuration in ZooKeeper. Figure 2 shows a 3-node Kafka cluster with 3 partitions and a replication
factor of 2.
Kafka elects one leader per partition, and the leader is the replica that serves consumers. A Kafka
producer routes messages directly to a specific broker instance; there is no intervening routing
tier. The client controls which partition it publishes messages to. This can be done at random,
implementing a kind of random load balancing at the software level.
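This client-side choice can be sketched as a tiny partitioner. This is a simplified illustration of the idea, not the actual Kafka client code; the function name and hashing scheme are our own:

```python
import random

def choose_partition(key, num_partitions, rng=random):
    """Pick a partition for a message.

    With a key, hash it so the same key always lands on the same
    partition; without one, fall back to random load balancing,
    mirroring the client-side routing described above.
    """
    if key is not None:
        # Python's % always yields a result in [0, num_partitions) here.
        return hash(key) % num_partitions
    return rng.randrange(num_partitions)
```

With three partitions, keyed messages are sticky to one partition while unkeyed messages spread across all three.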
By design, a Kafka consumer can start consuming messages from any position in the log, either
from the latest offset or from an earlier one; users can configure this through the client topic
configuration. The Kafka consumer works by issuing fetch requests to the brokers leading the
partitions it wants to consume. The consumer specifies its offset in the log with each request and
receives back a chunk of log beginning from that position. The consumer thus has significant
control over its consuming position and can rewind it to re-consume data if needed. This design
makes it possible to build highly fault-tolerant consumer frameworks. Kafka keeps all messages
that arrive at the broker until the retention time for that topic is exceeded, after which it deletes
old messages to free disk space for new ones.
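The offset-based fetch model can be illustrated with a minimal in-memory sketch; this is our own simplification, not Kafka's implementation:

```python
class PartitionLog:
    """Minimal sketch of one Kafka partition: an append-only list
    whose indices play the role of offsets."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the record

    def fetch(self, offset, max_records=100):
        # A consumer names its own offset on every fetch; rewinding to
        # re-consume old data is just fetching from an earlier offset.
        return self._records[offset:offset + max_records]

log = PartitionLog()
for event in ["e0", "e1", "e2"]:
    log.append(event)
```

Because the broker keeps no per-consumer cursor in this model, two consumers can read the same partition at different offsets without interfering with each other.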
RabbitMQ
RabbitMQ [3] is a messaging broker that provides a common platform for applications to send and
receive messages, along with reliable storage for messages until they are delivered. RabbitMQ is
designed to offer several important messaging features such as reliability, persistence,
guaranteed delivery, and high availability.
RabbitMQ supports messaging over a variety of protocols and primarily uses the AMQP 0-9-1
protocol. AMQP [4], the Advanced Message Queuing Protocol, is an open standard for passing
business messages between applications or organizations.
Figure 2: Kafka Cluster (three brokers hosting partitions P1–P3, each with replicas R1 and R2 and
one leader per partition; one producer and three consumers)
Figure 3: RabbitMQ Topic Routing
RabbitMQ's AMQP-based messaging model is built on producers/subscribers, exchanges, bindings,
and queues/topics. The core idea in this messaging model is that the producer never sends any
message directly to a specific queue.
Once a message is published, the producer does not even know whether it will be delivered to any
queue. Instead, the producer only sends messages to an exchange, and messages are routed through
exchanges before arriving at queues. The exchange acts as a middleman between the producer and the
queue/topic: it receives messages from producers and forwards them to queues. The exchange knows
exactly what to do with a message it receives: whether it needs to be added to a particular queue,
to many queues, or discarded. The rules for this selection are defined by the exchange type. A
binding is a relationship between an exchange and a queue; in other words, a binding means the
queue is interested in messages from that exchange. When the producer sends a message, it first
arrives at the exchange; then, based on how the message is addressed, the bindings between the
exchange and the matching queues determine which queues the message is transferred to. This is the
RabbitMQ messaging model in brief.
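The model above can be sketched as a toy direct exchange. This illustrates only the routing idea; the class and queue names are our own, and real RabbitMQ exchanges also support fanout, topic, and headers routing:

```python
from collections import defaultdict

class DirectExchange:
    """Toy AMQP-style direct exchange: bindings map a routing key to
    the queues interested in it; producers never touch queues directly."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> queue names
        self.queues = defaultdict(list)     # queue name -> messages

    def bind(self, queue, routing_key):
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        # The exchange decides: copy the message to every bound queue,
        # or, when no binding matches, silently drop it.
        for queue in self._bindings.get(routing_key, []):
            self.queues[queue].append(message)

exchange = DirectExchange()
exchange.bind("latency-q", "latency")
exchange.publish("latency", "payload-8kb")
exchange.publish("unbound-key", "lost")
```

Note that the producer only ever names a routing key; it is the binding, declared separately, that connects the exchange to a concrete queue.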
RabbitMQ supports high availability in a way similar to Apache Kafka's replication factor. By
default, a RabbitMQ queue is located on a single node in the cluster. To achieve fault tolerance,
we need to change this default behavior and have queues mirrored across multiple nodes. By setting
the high availability factor to 2, we tell RabbitMQ to keep two mirrors of each queue. Each
mirrored queue consists of one master and one or more slaves, depending on the HA factor. If the
old master disappears for any reason, the oldest slave is promoted to the new master.
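In RabbitMQ 3.6 this mirroring is configured through a policy; for example, a policy along these lines (the policy name and the queue-name pattern are placeholders) asks for exactly two mirrors per matching queue:

```shell
# Mirror every queue whose name starts with "ha." onto exactly 2 nodes.
rabbitmqctl set_policy ha-two "^ha\." \
  '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'
```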
Deployment
Here we used hardware available at Indiana University's Digital Science Center: the Juliet
SuperMicro HPC cluster, on which we deployed all broker clusters and ran all clients. The Juliet
cluster node configuration is as follows [5].
Juliet Compute Resource
System type : SuperMicro HPC Cluster
# Nodes : 128
# CPUs : 256
# Cores : 3456
RAM (GB) : 16384
Storage (TB) : 1024 (HDD), 50 (SSD)
Node configuration
# CPUs : 48
Cores per socket : 12
Sockets : 2
NUMA nodes : 2
Memory : 125 GB
SSD : 367 GB
Apache Kafka Cluster
The tests were run on a 3-node Apache Kafka cluster backed by a 3-node Apache ZooKeeper cluster;
ZooKeeper recommends a cluster of at least three nodes for robust operation. Each ZooKeeper node
runs on a separate Juliet node, and the nodes communicate via TCP. Each Kafka broker likewise runs
on its own Juliet node. In the tests we used one Kafka producer and three consumers, one consumer
per partition (as Apache Kafka recommends). All clients run on one separate Juliet node. Figure 4
shows the Apache Kafka deployment on the Juliet cluster.
Apache Kafka : kafka_2.10-0.8.2.2
Apache Kafka Client : 0.8.2.0
Apache Zookeeper: 3.4.6
Figure 4: Apache Kafka Cluster Setup
RabbitMQ Cluster
The tests were run on a 3-node RabbitMQ cluster with different configurations. Erlang was
installed on each machine, and RabbitMQ was installed from the RPM package available on the
RabbitMQ download page. Each RabbitMQ broker runs on a different Juliet node, and all producer
and consumer clients run on one separate Juliet node.
RabbitMQ : 3.6.1
Erlang : 18.3
Figure 5: RabbitMQ Cluster Setup
Performance Comparison
All test cases were run with two different configurations: one with fault tolerance and one
without. In the first round we measured round trip latency over 3000 messages with message sizes
ranging from 8 KB to 8 MB. Figure 6 shows the results for small message sizes, and Figure 7 shows
the results from small to large message sizes. For every test run, a fresh topic was created in
the Apache Kafka cluster and a fresh vhost in the RabbitMQ cluster.
Table 1: Variance of Round Trip Latency

Message size | Kafka (rep 1) | Kafka (rep 2) | RabbitMQ (HA 1) | RabbitMQ (HA 2)
8 KB         | 3.687         | 0.431         | 0.571           | 0.415
16 KB        | 0.493         | 0.814         | 0.580           | 0.634
32 KB        | 0.615         | 1.063         | 0.610           | 0.634
64 KB        | 1.012         | 1.078         | 0.899           | 0.707
128 KB       | 0.923         | 1.004         | 1.200           | 0.889
256 KB       | 1.076         | 1.129         | 1.764           | 1.542
512 KB       | 1.284         | 1.328         | 2.834           | 2.738
1 MB         | 1.957         | 2.094         | 5.03            | 4.375
2 MB         | 4.622         | 29.88         | 19.35           | 4.699
8 MB         | 23.853        | 107.786       | 8.205           | 37.02
In the second round of tests we used the same configurations but gave the clusters a warm-up
period: readings were taken only after the first 100 messages. Figure 8 shows the results for
small messages for all four test configurations of Apache Kafka and RabbitMQ, and Figure 9 shows
the same for small to large messages.
Figure 8: Kafka vs RabbitMQ small message sizes
Conclusion
Without fault tolerance enabled, Apache Kafka and RabbitMQ give the same level of round trip
latency (Figures 6 and 7). As expected, round trip latency increases with message size, since more
data has to be read from and written to I/O device buffers. With the fault-tolerance factor set
to 2, Apache Kafka has slightly higher round trip latency than RabbitMQ. For each broker, the
fault-tolerant round trip latency is always greater than the non-fault-tolerant round trip latency
of the same broker. For small messages, RabbitMQ shows almost the same round trip latency with and
without high availability, and the fault-tolerance latency gap grows with message size.
After warming up the clusters with one hundred messages, Apache Kafka's results improve slightly
(Figures 8 and 9): Kafka has high latency at startup, which then drops as the cluster settles into
the environment after the first hundred messages. Even so, RabbitMQ has the lowest round trip
latency in both configurations, although the fault-tolerance latency gap narrows as a result of
the warm-up step.
Overall, there is no dramatic difference between the two brokers, but RabbitMQ gave the best round
trip latency readings compared to Apache Kafka in both the fault-tolerant and non-fault-tolerant
configurations.
Future Work
Each broker can be configured differently to extract more performance depending on the nature of
the application. In this performance test we mostly used the default properties shipped with each
broker, changing only a few to match the testing environment. Here we used a single producer, but
it would be interesting to find the breaking point of both brokers under high load: the same test
could be run with an increasing number of publishers/producers to check the stability of both
clusters under load. The load test could also be repeated with increasing fault-tolerance factors
for each broker to observe how the latency changes.
References:
[1] http://kafka.apache.org
[2] https://zookeeper.apache.org
[3] https://www.rabbitmq.com
[4] https://www.amqp.org
[5] http://cloudmesh.github.io/introduction_to_cloud_computing/hardware/indiana.html