API rate limiting
Configure concurrency
import { defer } from "@defer.run/client";
const importContacts = () => {
// ...
};
export default defer(importContacts, {
concurrency: 2,
});
Combine concurrency and retries for API rate limiting
The combination of retry
and concurrency
is perfect for fixed API rate
limiting.
For example, an askOpenAI()
background function is limited to 10 calls per
minute.
Knowing that askOpenAI()
runs, on average, for 1min, the following
configuration would be perfect to optimize the performance while matching the
OpenAI API rate limits:
import { defer } from "@defer.run/client";
const importContacts = () => {
// ...
};
export default defer(importContacts, {
concurrency: 10,
retry: 5,
});
The concurrency
ensures we don’t cross the 10 calls per minute with our ~1min
per call. On the other hand, the retry
option is here to recover some
scenarios where multiple sequential calls would execute under 1min.
Here is an easy rule of thumb to find your perfect concurrency
value for any
API rate limiting scenario:
concurrency = Math.floor(callsMaxPerMinute * AvgDurationInMinutes)
concurrency | callsMaxPerMinute | AvgDurationInMinutes |
---|---|---|
10 | 10 | 1 (1 min) |
5 | 10 | 0.5 (30 secs) |
0 | 10 | 0.08 (5 secs) |
The last line is a perfect example of the limitation of this approach.
Moreover, as many public 3rd party APIs rely on dynamic rate limiting using the
leaky bucket algorithm and other
algorithms, we will soon roll out a throttling
option to better configure rate
limiting.
Was this page helpful?