What we've learned from running thousands
of production RabbitMQ clusters
Lovisa Johansson
lovisa@cloudamqp.com
3000 emails
● Unstable RabbitMQ version
● Unoptimized configuration for a specific use case
➢ High availability
➢ High Performance
● Users (you?) are using RabbitMQ in a bad way
● Client libraries are using RabbitMQ in a bad way
● Things are not done in an optimal way
● Customer use cases
● Configuration mistakes
● Common mistakes
Client side problems
Server side problems
Lovisa Johansson
Marketing Manager
Support Engineer
RabbitMQ Engineer
Umeå, Sweden
Largest provider of managed RabbitMQ servers
● 23000 running instances
● 7 clouds
● 75 regions
● Headquarters: Stockholm, Sweden
Don’t use too many connections or channels
● Keep connection/channel count low
● Each connection uses about 100 KB of RAM
● Thousands of connections can be a heavy burden on a RabbitMQ server
● Channel and connection leaks are among the most common errors that we see
Recommendation number 1.
CONNECTIONS AND CHANNELS
Don’t open and close connections or channels repeatedly
● Use long-lived connections.
● Don’t open a channel every time you publish.
● Open AMQP connection: 7 TCP packets
● Open AMQP channel: 2 TCP packets
● AMQP publish: 1 TCP packet
● Close AMQP channel: 2 TCP packets
● Close AMQP connection: 2 TCP packets
Total: 14-19 packets (+ acks)
Recommendation number 2.
CONNECTIONS AND CHANNELS
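To make the cost of connection churn concrete, here is a rough sketch in Python that compares the two publishing styles using the per-operation counts from the slide. It assumes the lower bound of 14 for a full open/publish/close cycle and ignores TCP acks; the function names are made up for illustration.

```python
# Per-operation counts as listed on the slide (lower bound, TCP acks ignored).
COUNTS = {
    "open_connection": 7,
    "open_channel": 2,
    "publish": 1,
    "close_channel": 2,
    "close_connection": 2,
}


def cost_with_churn(messages: int) -> int:
    """Open and close a connection and a channel around every single publish."""
    return messages * sum(COUNTS.values())


def cost_with_reuse(messages: int) -> int:
    """Open one connection and channel, publish everything, then close once."""
    setup_and_teardown = sum(COUNTS.values()) - COUNTS["publish"]
    return setup_and_teardown + messages * COUNTS["publish"]


print(cost_with_churn(1000))  # 14000
print(cost_with_reuse(1000))  # 1013
```

For 1000 messages, that is roughly 14 times more traffic with churn than with one long-lived connection.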
AMQProxy
● Some clients can’t keep long-lived connections (looking at you, PHP)
● Avoid connection churn by using a proxy that pools connections and channels for reuse.
● Our benchmarks show that the proxy increases publishing speed by an order of magnitude or more.
● https://github.com/cloudamqp/amqproxy
Separate connections for publishers and consumers
● Flow control: a client might not be able to consume if the connection is in flow control
● Back pressure: RabbitMQ can apply back pressure on the TCP connection when the publisher is sending too many messages
Recommendation number 3.
CONNECTIONS AND CHANNELS
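One possible way to combine recommendations 2 and 3 in application code is sketched below. It is not taken from the talk. The class name `AmqpClient` and the `connect` factory are hypothetical; the factory is assumed to return a pika-style connection whose `channel()` result offers `basic_publish` and `basic_consume`.

```python
from typing import Any, Callable, Dict, Tuple


class AmqpClient:
    """Keeps one long-lived connection for publishing and another for consuming."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        # `connect` is a hypothetical factory returning a pika-style connection.
        self._connect = connect
        self._links: Dict[str, Tuple[Any, Any]] = {}  # role -> (connection, channel)

    def _channel(self, role: str) -> Any:
        # Open the connection and channel for this role once, then reuse them.
        if role not in self._links:
            connection = self._connect()
            self._links[role] = (connection, connection.channel())
        return self._links[role][1]

    def publish(self, exchange: str, routing_key: str, body: bytes) -> None:
        self._channel("publish").basic_publish(
            exchange=exchange, routing_key=routing_key, body=body
        )

    def consume(self, queue: str, on_message: Callable[..., None]) -> None:
        # Separate connection, so back pressure on the publisher does not stall consumers.
        self._channel("consume").basic_consume(
            queue=queue, on_message_callback=on_message
        )
```

However many messages are published, this client opens at most two connections: one per role.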
Don't have too large queues
● Keep fewer than 10 000 messages in one queue
● Large queues put a heavy load on RAM usage
○ In order to free up RAM, RabbitMQ starts paging messages out to disk
○ Paging blocks the queue from processing messages
● Time-consuming to restart a cluster
● Limit queue size with TTL or max-length
Recommendation number 4.
QUEUES
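As a small illustration of the last bullet, the helper below builds the optional queue arguments `x-max-length` and `x-message-ttl`, which a client would pass when declaring the queue. The helper name and its parameters are hypothetical. The same limits can also be applied through a policy (`max-length`, `message-ttl`).

```python
from typing import Dict, Optional


def bounded_queue_arguments(
    max_length: Optional[int] = None, ttl_seconds: Optional[float] = None
) -> Dict[str, int]:
    """Build queue arguments that cap queue length and message age."""
    arguments: Dict[str, int] = {}
    if max_length is not None:
        # Maximum number of messages the queue will hold.
        arguments["x-max-length"] = max_length
    if ttl_seconds is not None:
        # RabbitMQ expects the TTL in milliseconds.
        arguments["x-message-ttl"] = int(ttl_seconds * 1000)
    return arguments


print(bounded_queue_arguments(max_length=10_000, ttl_seconds=60))
```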
Enable lazy queues to get predictable performance
● Lazy queues were added in RabbitMQ 3.6
● Writes messages to disk immediately, thus spreading the work out over time instead of taking the risk of a performance hit somewhere down the road
● More predictable and smooth performance curve
○ Messages are only loaded into memory when they are needed.
Enable lazy queues if...
● the publisher is sending many messages at once
● the consumers are not keeping up with the speed of the publishers all the time
Ignore lazy queues if...
● you require high performance
● queues are always short
Recommendation number 5.
QUEUES
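The decision rule on this slide can be written down as a small helper. This is only a sketch: the function and its flags are hypothetical, and it assumes the "ignore" conditions take precedence when both apply. The returned `x-queue-mode` argument is the one used to declare a classic queue as lazy.

```python
from typing import Dict


def queue_mode_arguments(
    bursty_publishers: bool,
    consumers_fall_behind: bool,
    needs_peak_throughput: bool = False,
    always_short: bool = False,
) -> Dict[str, str]:
    """Return queue arguments following the slide's lazy-queue rule of thumb."""
    # Assumption: the "ignore lazy queues" conditions win over the "enable" ones.
    if needs_peak_throughput or always_short:
        return {}
    if bursty_publishers or consumers_fall_behind:
        return {"x-queue-mode": "lazy"}
    return {}


print(queue_mode_arguments(bursty_publishers=True, consumers_fall_behind=False))
```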
Don’t set RabbitMQ Management statistics rate mode to detailed
● The RabbitMQ management plugin collects and calculates metrics for every queue, connection, and channel in the cluster
● This slows down the server if you have thousands upon thousands of active queues or consumers
Recommendation number 6.
QUEUES
Split queues over different cores, and route messages to multiple
queues
Recommendation number 7.1
QUEUES
● A queue is single-threaded
○ About 50k messages/s per queue
● Queue performance is limited to one CPU core.
● All messages routed to a specific queue will end up on the node where that queue resides.
Plugins
● The consistent hash exchange plugin
● RabbitMQ sharding
Recommendation number 7.2
QUEUES
The consistent hash exchange plugin
● Load-balance messages between queues
● Messages are consistently and equally distributed across many queues
● Consume from all queues
● https://github.com/rabbitmq/rabbitmq-consistent-hash-exchange
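To show the general idea behind consistent hashing, here is a minimal hash ring in Python. It is an illustration of the technique only, not the plugin's actual implementation; the class name, the number of ring points per queue, and the use of SHA-256 are all assumptions.

```python
import bisect
import hashlib
from typing import List, Tuple


def _hash(value: str) -> int:
    # Stable hash, so the same routing key always lands on the same ring position.
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Maps routing keys to queues via points placed on a hash ring."""

    def __init__(self, queues: List[str], points_per_queue: int = 100) -> None:
        ring: List[Tuple[int, str]] = []
        for queue in queues:
            for i in range(points_per_queue):
                ring.append((_hash(f"{queue}#{i}"), queue))
        ring.sort()
        self._points = [point for point, _ in ring]
        self._queues = [queue for _, queue in ring]

    def queue_for(self, routing_key: str) -> str:
        # First ring point at or after the key's hash, wrapping around at the end.
        index = bisect.bisect_left(self._points, _hash(routing_key))
        return self._queues[index % len(self._points)]


ring = HashRing(["q1", "q2", "q3", "q4"])
print(ring.queue_for("order-42"))
```

A given routing key always maps to the same queue, and removing a queue only remaps the keys that were on that queue.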
Recommendation number 7.3
QUEUES
RabbitMQ sharding
● Automatic partitioning of queues
● Queues are created on every cluster node and messages are sharded across them
● Shows one queue to the consumer, but it could be many queues running behind it in
the background
● https://github.com/rabbitmq/rabbitmq-sharding
Recommendation number 8.
QUEUES
Limit the use of priority queues
● Each priority level uses an internal queue on the Erlang VM, which takes up
resources.
● In most use cases it's sufficient to have no more than 5 priority levels.
Recommendation number 9.
QUEUES
Use persistent messages and durable queues
● Messages, exchanges, and queues that are not durable and persistent are lost during a broker restart
● For high performance, use transient messages and temporary or non-durable queues
Recommendation number 10.1
PREFETCH
Adjust prefetch value
● Limits how many messages the client can receive before acknowledging a message
● RabbitMQ default prefetch value - unlimited buffer
● RabbitMQ 3.7
○ Option to adjust the default prefetch
○ CloudAMQP servers have a default prefetch of 1000
Recommendation number 10.2
PREFETCH
Prefetch - Too small prefetch value
RabbitMQ spends most of its time waiting for permission to send more messages
Recommendation number 10.3
PREFETCH
Prefetch - Too large prefetch value
Recommendation number 10.4
PREFETCH
Prefetch
● One single or few consumers with short processing time
○ prefetch many messages at once
● About the same processing time and a stable network
○ estimate the prefetch value by dividing the total round-trip time by the processing time on the client for each message
● Many consumers, and short processing time
○ A lower prefetch value than for one single or few consumers
● Many consumers, and/or long processing time
○ Set prefetch count to 1 so that messages are evenly distributed among all
your workers
● The prefetch value has no effect if your client auto-acks messages
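The round-trip rule of thumb above can be turned into a small calculation. The sketch below is an assumption-laden example: the function name is hypothetical, and rounding up with a floor of 1 is a choice made here, not something stated in the talk. In a pika-style client the result would typically be passed to `basic_qos(prefetch_count=...)`.

```python
import math


def estimate_prefetch(round_trip_ms: float, processing_ms: float) -> int:
    """Estimate prefetch as round-trip time divided by per-message processing time."""
    if round_trip_ms <= 0 or processing_ms <= 0:
        raise ValueError("both times must be positive")
    # Round up and never go below 1, so slow consumers still get one message at a time.
    return max(1, math.ceil(round_trip_ms / processing_ms))


print(estimate_prefetch(round_trip_ms=125, processing_ms=5))   # 25
print(estimate_prefetch(round_trip_ms=10, processing_ms=500))  # 1
```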
Recommendation number 11.
HiPE
HiPE
● HiPE increases server throughput at the cost of increased start-up time
○ increases throughput by 20-80%
○ increases start-up time by about 1 to 3 minutes
● HiPE is recommended if you require high performance
● We no longer consider HiPE experimental
Acknowledgments and Confirms
● Pay attention to where in your consumer logic you’re acknowledging messages
● For the fastest possible throughput, manual acks should be disabled
● Publish confirms are required if the publisher needs messages to be processed at least once
Recommendation number 12.
ACKS AND CONFIRMS
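Where the ack sits in consumer logic matters because a message acknowledged before processing is lost if the consumer crashes. The sketch below shows one common arrangement, assuming a pika-style callback signature and channel methods (`basic_ack`, `basic_nack`). The helper name `make_on_message` is hypothetical, and requeueing on failure is an assumption that may cause redelivery loops for messages that can never be processed.

```python
from typing import Any, Callable


def make_on_message(process: Callable[[bytes], None]) -> Callable[..., None]:
    """Wrap `process` in a pika-style callback that acks only after success."""

    def on_message(channel: Any, method: Any, properties: Any, body: bytes) -> None:
        try:
            process(body)
        except Exception:
            # Processing failed: hand the message back so it can be redelivered.
            channel.basic_nack(delivery_tag=method.delivery_tag, requeue=True)
            return
        # Ack last, so a crash during processing leaves the message unacknowledged.
        channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message
```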
Great improvements are made to RabbitMQ, all the time <3
● 3.7
○ Default prefetch
○ Individual vhost message stores
● 3.6
○ Many memory problems, up to version 3.6.14
○ Lazy queues
● 3.5
○ Still many customers on 3.5.7
Recommendation number 13.
VERSION
Use a stable RabbitMQ version
Backward compatibility is really good in RabbitMQ
● Some plugins are consuming lots of resources
● Make sure to disable plugins that you are not using
Recommendation number 14.
Plugins
Disable plugins you are not using
● Unused queues take up some resources (queue index, management statistics, etc.)
● Temporary queues should be auto deleted
Recommendation number 15.
Unused queues
Delete unused queues
● Message loss on netsplits
● Needed to be able to upgrade without losing messages at CloudAMQP
Recommendation number 16.
VHOST
Enable HA-vhost policy on custom vhosts
Summary Overall
Server side problems
● Short queues
● Long lived connections
● Limited use of priority queues
● Use multiple queues and consumers
● Split your queues over different cores
● Stable Erlang and RabbitMQ version
● Disable plugins you are not using
● Channels on all connections
Summary Overall
Server side problems
● Separate connections for publishers and consumers
● Management statistics rate mode
● Delete unused queues
● Temporary queues should be auto deleted
Summary High Performance
Server side problems
● Short queues
○ max-length if possible
● Do not use lazy queues
● Send transient messages
● Disable manual acks and publish confirms
● Avoid multiple nodes (HA)
● Enable RabbitMQ HiPE
Summary High Availability
Server side problems
● Enable lazy queues
● RabbitMQ HA - 2 nodes
○ HA-policy on all vhosts
● Persistent messages, durable queues
● Do not enable HiPE
DIAGNOSTIC TOOL
Diagnostics Tool
● RabbitMQ and Erlang version
● Queue length
● Unused queues
● Persistent messages in durable queues
● No mirrored auto delete queues
● Limited use of priority queues
● Long lived connections
● Connection and channel leak
● Channels on all connections
● Insecure connections
● Client library
● AMQP Heartbeats
● Channel prefetch
● Management statistics rate mode
● Ensure that you are not using topic exchange as fanout
● Ensure that all published messages are routed
● Ensure that you have a HA-policy on all vhosts
● Auto delete on temporary queues
● No transient messages in mirrored queues
● Separate connections for publishers and consumers
It should be easier to do things right!
Questions?
Visit www.cloudamqp.com for the blog, documentation, and FAQ
