A programming task I've often needed, especially while writing benchmarking code, is to perform an operation a given number of times per second. For example, I want to see how a queue system performs under a load of 10 messages per second. To do this, I write a script to generate messages at the given rate. What is the correct pattern to ensure it runs at the correct rate when the time to execute each iteration of the loop is variable, due to differences in how long it takes to construct the message, or to scheduling delays?
The examples in this post will use Python, but the pattern applies in any language.
A naive approach is to put a fixed delay in each iteration of our loop:
import time

MESSAGES_PER_SECOND = 10.0

while True:
    create_message()
    time.sleep(1.0/MESSAGES_PER_SECOND)
This will always execute slightly slower than desired, because it assumes that create_message() and the rest of the loop take no time to execute. A slightly better approach is to time how long create_message() takes, and deduct it from the time to sleep:
while True:
    start_time = time.monotonic()
    create_message()
    end_time = time.monotonic()
    time.sleep(1.0/MESSAGES_PER_SECOND - (end_time - start_time))
This gets us closer, but still doesn't take into account the time to execute the loop itself, or operating system scheduling delays that happen outside of create_message(). The operating system could at any time decide that our process won't run for a moment, and if that moment is outside the start_time/end_time block, we can't account for it. Even if the delay does happen inside the timed block, what if the delay is longer than 1.0/MESSAGES_PER_SECOND? We can't sleep for a negative amount. Since we want to achieve an average throughput, the correct behavior would be to keep calling create_message() without any delay until we bring the average up to the desired value.
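One tempting fix, sketched below with a placeholder create_message() (this is an illustrative detour, not the final pattern), is to clamp the sleep at zero so a slow iteration doesn't raise an error. But each iteration only looks at its own duration, so the time lost on a slow iteration is never paid back and the average rate drifts below the target:

import time

MESSAGES_PER_SECOND = 10.0

def create_message():
    pass  # placeholder for the real message-building work

while True:
    start_time = time.monotonic()
    create_message()
    end_time = time.monotonic()
    # Clamp to zero: a slow iteration no longer crashes the loop, but its
    # overrun is simply dropped instead of being made up later.
    time.sleep(max(0.0, 1.0/MESSAGES_PER_SECOND - (end_time - start_time)))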
We need some kind of feedback system that constantly tracks our progress, without any gaps. The breakthrough comes when you keep track of the elapsed time we expect to see assuming we're hitting our target: for 10 tasks per second, we expect to see an elapsed time of 0.1 s, 0.2 s, 0.3 s, and so on. If we see a lower time, we're getting ahead of ourselves and need to sleep. If we see a higher time, we got slowed down and need to proceed without any delays until we're caught up. In code, it looks like this:
expected = time.monotonic()
while True:
    create_message()
    expected += 1.0/MESSAGES_PER_SECOND
    time.sleep(max(0, expected - time.monotonic()))
This code can be delayed at any point and still work, as long as the body of the loop is fast enough on average. Even if it's too slow on some iterations, we won't sleep at all until we're caught up.
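If you want to check this yourself, here is a small self-contained simulation (my own sketch; the random sleeps stand in for real message construction) that runs the pattern for a few seconds and reports the achieved rate, which should come out close to 10 per second:

import random
import time

MESSAGES_PER_SECOND = 10.0

def create_message():
    # Stand-in for the real work: usually fast, occasionally slower
    # than the 0.1 s budget for a single iteration.
    time.sleep(random.choice([0.02, 0.05, 0.2]))

count = 0
start = time.monotonic()
expected = start
while time.monotonic() - start < 5.0:  # run for roughly five seconds
    create_message()
    count += 1
    expected += 1.0 / MESSAGES_PER_SECOND
    time.sleep(max(0, expected - time.monotonic()))

elapsed = time.monotonic() - start
print(f"{count} messages in {elapsed:.2f} s = {count / elapsed:.2f} per second")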
Click the example below to see what the behavior looks like in practice.
See the Pen GMVmqG by Gavin (@gavinwahl) on CodePen.
You can see how most of the time there's about a 75 millisecond sleep, but when the loop takes longer than expected it runs several times with no delay to catch up.