Technical deep-dive: Dynamic queue size optimization
In this series, we ask our developers to provide insights into features and optimizations that are not very visible on the surface, but have a lot of impact under the hood. For this edition, one of our engineers is going to talk about a new feature that automatically determines the optimal in-memory queue size for specific IPs and domain groups.
We’ve designed MailerQ so that nearly everything can be configured and dynamically rerouted at almost every stage of the application. This level of configurability has many advantages, but it also means people frequently ask us for optimal setting values, and for most settings the honest answer is that it depends on the workload. We can usually give you a reasonable default, but finding the right value is a manual process that involves a good amount of measuring and even some guesswork.
One setting that can have a big impact on performance, and the one we will discuss today, is queue size. It can be set on Email Throttles and Flood Patterns and is part of the advanced settings. Since the recent addition of deep queue assignment, it has become a lot harder to estimate, because it is also propagated to the pool and MTA IP level for a domain.
What does the queue size setting for in-memory queues entail?
In short, the queue size setting specifies the number of messages that MailerQ may hold in memory waiting for a certain domain group. This in-memory queue only includes messages that are waiting to be assigned to a new or existing SMTP connection, not messages that have already been assigned to one. MailerQ does nothing with the messages in the in-memory queue until it decides to actually send them. Once the throttles have been evaluated and other messages have been sent, MailerQ picks the best message from this queue, and that is the next message sent over a connection.
The problem
Holding messages in memory can be a problem because there is an overall limit on how many messages may be in memory at the same time, given by the "RabbitMQ quality of service" setting. As soon as this limit is hit, RabbitMQ stops sending messages to MailerQ, and the outbox queue can fill up. This is very undesirable, since from that point on all messages pile up in RabbitMQ without MailerQ even seeing them! In this scenario, unrelated messages that could be sent right away wait in the queue unnecessarily because some receiver cannot keep up.
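To make the blocking effect concrete, here is a minimal toy model in Python. It is only a sketch and not how MailerQ is implemented: the function run_ticks, the parameter memory_limit (standing in for the "RabbitMQ quality of service" setting) and the domain names are all made up for illustration.

```python
from collections import deque

def run_ticks(outbox, memory_limit, accepting, ticks):
    """Toy model of consuming from an outbox under a global in-memory limit.

    outbox:       domain names in arrival order (one entry per message)
    memory_limit: hypothetical stand-in for the quality of service setting
    accepting:    set of domains whose receivers currently accept mail
    Returns (delivered, still_in_outbox).
    """
    outbox = deque(outbox)
    in_memory = []
    delivered = []
    for _ in range(ticks):
        # The broker only hands over messages while there is room in memory.
        while outbox and len(in_memory) < memory_limit:
            in_memory.append(outbox.popleft())
        # Send every message whose receiver accepts mail; the rest keep waiting.
        delivered += [d for d in in_memory if d in accepting]
        in_memory = [d for d in in_memory if d not in accepting]
    return delivered, list(outbox)

# Five messages for a receiver that cannot keep up, followed by three
# messages for a receiver that accepts everything.
outbox = ["slow.example"] * 5 + ["fast.example"] * 3

blocked, waiting = run_ticks(outbox, memory_limit=4,
                             accepting={"fast.example"}, ticks=10)
roomy, _ = run_ticks(outbox, memory_limit=8,
                     accepting={"fast.example"}, ticks=10)
print(len(blocked), waiting.count("fast.example"), len(roomy))
```

With a limit of four, memory fills up with messages for the slow receiver and the three messages for the fast receiver never leave the outbox. With a limit of eight, they are consumed and delivered in the first tick.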
A simple solution is to set the queue size very low: messages are then no longer queued and are only fetched when they can actually be sent. This keeps the memory footprint low, but it can hurt throughput for large receivers. Because MailerQ can no longer keep messages at hand, it can only fetch new messages for the domain once the old ones have been sent. While this explanation is simplified, the effect is a large bottleneck in the pipeline, even though the MTA could send a lot more and the receiver could receive a lot more!
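A rough back-of-the-envelope model shows why a small queue caps throughput. The sketch below assumes, purely for illustration, that MailerQ sends a full batch of queue_size messages at send_rate messages per second and then waits refill_delay seconds for the next batch to arrive. Both the model and the numbers are assumptions, not measurements of MailerQ.

```python
def effective_rate(queue_size, send_rate, refill_delay):
    """Messages per second when every batch must be refetched after sending.

    queue_size:   messages held in memory for the domain
    send_rate:    messages per second the connection could sustain
    refill_delay: seconds spent waiting for the next batch (assumed constant)
    """
    time_sending = queue_size / send_rate
    return queue_size / (time_sending + refill_delay)

# Hypothetical numbers: a receiver that could take 1000 messages per second
# and a 50 ms delay to fetch more messages.
small = effective_rate(queue_size=10, send_rate=1000, refill_delay=0.05)
large = effective_rate(queue_size=1000, send_rate=1000, refill_delay=0.05)
print(round(small), round(large))
```

Under these assumptions the small queue reaches roughly a sixth of what the receiver could accept, while the large queue gets close to the full rate. The refill wait dominates when the batch is small.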
The solution: dynamic queue size optimization
Since MailerQ 5.11, this is no longer an issue. The queue size can still be capped at a small number to keep memory usage low, but MailerQ now dynamically optimizes the actual number within that cap. A domain that can accept more messages gets a higher allowance for messages in memory, and a domain with a stricter receiving policy gets a lower one. This has two advantages: the in-memory buffer grows to accommodate what can actually be sent, and if MailerQ is suddenly throttled during sending, it rapidly lowers the number of messages in memory to make room for other domains. Consuming from the outbox therefore does not stall (so the outbox should never fill up), yet there is no performance penalty from having too few messages in memory! You can now simply set the queue size to the maximum you are willing to allow, considering only memory rather than all the other variables that previously had to be taken into account, which makes MailerQ a lot simpler to configure.
Naturally, this combines well with deep queue assignments. It gives MailerQ more freedom to optimize overall application throughput while ensuring that the outbox keeps being processed, so messages no longer wait on messages for unrelated domains or on unrelated pools and MTA IPs. That way you can be sure that MailerQ will never ‘hang’ because one domain is using up all the resources, and that everything is sent as fast as possible.
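The exact algorithm MailerQ uses is not described here, but the general idea of "grow while sending goes well, shrink quickly when throttled" can be sketched with a simple additive-increase, multiplicative-decrease rule. This rule is our stand-in for illustration, not MailerQ's actual logic, and the class QueueAllowance and its method names are hypothetical.

```python
class QueueAllowance:
    """Per-domain allowance for in-memory messages, bounded by a configured cap.

    Illustrative only: uses additive increase and multiplicative decrease,
    which is an assumption and not necessarily what MailerQ does.
    """

    def __init__(self, max_size, start=1):
        self.max_size = max_size                      # the configured queue size (cap)
        self.current = min(max(start, 1), max_size)   # the current allowance

    def on_delivered(self):
        # The receiver accepted a message: allow one more in memory, up to the cap.
        self.current = min(self.current + 1, self.max_size)

    def on_throttled(self):
        # The receiver pushed back: halve the allowance, but keep at least one.
        self.current = max(self.current // 2, 1)


allowance = QueueAllowance(max_size=100, start=8)
for _ in range(5):
    allowance.on_delivered()
after_growth = allowance.current      # 8 + 5 = 13
allowance.on_throttled()
allowance.on_throttled()
after_throttle = allowance.current    # 13 -> 6 -> 3
print(after_growth, after_throttle)
```

The allowance creeps up while deliveries succeed and drops sharply after two throttling events, freeing memory for other domains. The configured maximum stays the only value an operator has to choose.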
Want to try out this and other improvements? Upgrade now to MailerQ 5.11 or later!