At Single, we are closely integrated with Shopify — our main offering is an app in the Shopify app store. This means that our system integrates frequently with Shopify’s API. We request access to users’ stores, publish products, add metadata to products, create charges for sales, and receive events for purchases and product updates — to name a few things.
Shopify’s API has a defense mechanism that many public-facing APIs have, namely, rate limiting. Rate limiting, as the name implies, is a way for servers to throttle the requests coming in from individual clients. This allows the servers to service a larger number of clients and makes the API less susceptible to attack from malicious or misconfigured clients. Exposing a non-rate-limited API to the public internet can be a recipe for disaster!
Leaky Bucket Rate Limiting
There are many different flavors of rate limiting out there, but they all aim to control the number of requests clients are allowed to make in any given amount of time. The rate-limiting algorithm implemented in Shopify’s API limits client requests to a certain number per second while still allowing for short bursts of requests that exceed that limit. It’s called a “leaky bucket” rate limiter, and it works much the way the metaphor suggests:
Imagine that each request a client sends to the API goes into a leaky bucket. Each client gets its own bucket. The only way client requests get to the server is through the leak in the bottom of the client’s bucket. No matter how sporadically requests come in from the client, requests leak out of the bucket at a constant rate of two requests per second. Thus, if requests from the client exceed the constant rate of the leak, the bucket starts to fill up. In this case, the bucket can only hold 40 requests, after which the bucket overflows and the client’s requests are rejected with an HTTP error status code: 429 Too Many Requests. The client then has to wait for the bucket to drain some more before making subsequent requests, or the server will continue to return error responses.
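To make the metaphor concrete, here is a minimal, in-memory sketch of a leaky bucket in Java. The 40-request capacity and two-per-second leak rate come straight from the description above; everything else is illustrative and is not how Shopify actually implements it.

```java
import java.time.Instant;

// A minimal, single-process sketch of the leaky bucket described above.
// The capacity (40) and leak rate (2/sec) match the limits discussed in this post;
// the class and method names are illustrative, not Shopify's implementation.
public class LeakyBucket {
    private static final int CAPACITY = 40;        // bucket overflows past this many queued requests
    private static final double LEAK_PER_SEC = 2;  // requests drain out at this constant rate

    private double level = 0;                      // how "full" the bucket currently is
    private Instant lastDrain = Instant.now();

    // Returns true if the request fits in the bucket, false if it would overflow (HTTP 429).
    public synchronized boolean tryAcquire() {
        drain();
        if (level + 1 > CAPACITY) {
            return false;                          // bucket is full: reject with 429 Too Many Requests
        }
        level += 1;
        return true;
    }

    // Leak requests out of the bucket at the constant rate, based on elapsed time.
    private void drain() {
        Instant now = Instant.now();
        double elapsedSec = (now.toEpochMilli() - lastDrain.toEpochMilli()) / 1000.0;
        level = Math.max(0, level - elapsedSec * LEAK_PER_SEC);
        lastDrain = now;
    }
}
```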

Dealing With Rate Limits the Lazy Way
Initially, we employed the laziest possible solution to dealing with Shopify’s API rate limits: we simply introduced arbitrary sleep periods into certain processes that were known to generate a high volume of API calls. For example, we have some batch processes that loop through a shop’s orders and do some processing on each one that requires several API calls to Shopify per order. Inside that loop, we’d instruct the code to halt for a short interval after processing each order. This extremely naive approach worked most of the time, but broke down at other times. We would still occasionally get 429 responses, particularly when parallel processes were accessing the API on behalf of the same shop at the same time. These errors often left processes in weird states, some of which required manual intervention to fix, and manual intervention means work! We can’t have that.
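Roughly speaking, the lazy approach boiled down to something like this. The method names and the 500 ms pause are hypothetical stand-ins for our batch code, not the real thing.

```java
import java.util.List;

// The "lazy" approach from the paragraph above, roughly. The order-processing
// method names and the 500 ms pause are hypothetical stand-ins, not our actual batch code.
public class NaiveBatchJob {
    void run(String shopId) throws InterruptedException {
        for (String orderId : fetchOrderIds(shopId)) {
            processOrder(orderId);   // several Shopify API calls per order
            Thread.sleep(500);       // arbitrary delay: works most of the time, but breaks
                                     // down when parallel jobs hit the same shop's bucket
        }
    }

    // Stubs so the sketch compiles on its own.
    List<String> fetchOrderIds(String shopId) { return List.of(); }
    void processOrder(String orderId) { }
}
```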
Mirroring The Leaky Bucket
It became apparent that to solve this problem in the non-lazy way, we would need our system to keep track of how many requests were sent to Shopify for each shop, and how often. In other words, our application would need to maintain a mirrored version of the state of the shop’s metaphorical request bucket that matches the state of the bucket in Shopify’s system. Then, our system can know ahead of time whether it is about to overflow the bucket, and can introduce precise delays to remain within limits while still performing as efficiently as possible. We needed to, in effect, rate limit ourselves by throttling our outbound requests to Shopify before they leave our system.
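The bookkeeping behind that mirror is simple arithmetic: given the timestamp of the last request and the remaining capacity we recorded at that moment, we can estimate how much the bucket has drained since. Here is a rough sketch; the constants match the limits described earlier, and the names are illustrative rather than our exact library code.

```java
import java.time.Instant;

// A sketch of the arithmetic behind "mirroring the bucket": given what we knew about
// the bucket at the time of the last request, estimate its state right now.
// The constants and names are assumptions based on the limits described above.
public class BucketMirror {
    private static final int CAPACITY = 40;
    private static final double LEAK_PER_SEC = 2;

    // remainingAtLastRequest and lastRequestAt are the two values kept per shop.
    public static int estimateRemaining(int remainingAtLastRequest, Instant lastRequestAt) {
        double elapsedSec = (Instant.now().toEpochMilli() - lastRequestAt.toEpochMilli()) / 1000.0;
        double drained = elapsedSec * LEAK_PER_SEC;   // requests that have leaked out since then
        return (int) Math.min(CAPACITY, remainingAtLastRequest + drained);
    }

    // If the bucket looks full, this is how long to wait before one slot frees up.
    public static long millisUntilNextSlot(int remainingNow) {
        return remainingNow > 0 ? 0 : (long) (1000 / LEAK_PER_SEC);
    }
}
```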
Coordinating Our Distributed Services
While I wouldn’t call our system a microservice architecture, it is distributed enough that we have four distinct services that interact with the Shopify API:

As you can see, with multiple services interacting with Shopify’s API on a shop’s behalf, we have the potential for concurrently executing requests. This means that the request bucket state mentioned earlier has to be stored in a shared location external to the individual services.
We chose Redis to store this request count state because it’s a fast, in-memory store and, perhaps more conveniently, it was already in our ecosystem. To make sure that modifications to the state of a shop’s requests happen atomically, we also introduced a distributed locking mechanism around the Shopify requests, also built on Redis. It guarantees that only one request is ever going out for a particular shop at any given time. To avoid duplicating our new, fancy Shopify client code across all the services that talk to Shopify, we pulled the client code into a library that is shared among the services. The landscape looks a little different now:
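As a rough illustration of the locking piece, here is a minimal per-shop lock built on Jedis, in the spirit of the jedis-lock library mentioned in the acknowledgements. The key format, timeout, and retry interval are assumptions for the sketch, not our production values.

```java
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

// A minimal sketch of a per-shop lock in Redis using Jedis, in the spirit of the
// jedis-lock library mentioned below; key format, timeouts, and names are assumptions.
public class ShopLock {
    private final Jedis jedis;
    private final String key;
    private final String token = UUID.randomUUID().toString(); // identifies this lock holder

    public ShopLock(Jedis jedis, String shopId) {
        this.jedis = jedis;
        this.key = "shopify:lock:" + shopId;
    }

    // Spin until SET NX succeeds, so only one request per shop is in flight at a time.
    public void acquire() throws InterruptedException {
        while (!"OK".equals(jedis.set(key, token, SetParams.setParams().nx().px(10_000)))) {
            Thread.sleep(50); // someone else holds the lock; wait a moment and retry
        }
    }

    // Release only if we still hold the lock (a production version would do this atomically in Lua).
    public void release() {
        if (token.equals(jedis.get(key))) {
            jedis.del(key);
        }
    }
}
```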

How It Works
Our Shopify library code essentially wraps all the HTTP requests and responses between Single’s services and Shopify’s API. It performs the rate limiting logic while maintaining state in Redis.
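As one way to picture the wrapping, here is how a request and response interceptor pair (sketched in the next two sections) could be attached to an HTTP client. The choice of OkHttp and the class names here are illustrative assumptions, not necessarily what our library uses.

```java
import okhttp3.OkHttpClient;
import redis.clients.jedis.JedisPool;

// Illustrative wiring only: the HTTP client (OkHttp) and class names are assumptions.
// The two interceptor classes are the sketches shown in the next two sections.
public class ShopifyClientFactory {
    public static OkHttpClient build(JedisPool redis, String shopId) {
        return new OkHttpClient.Builder()
                .addInterceptor(new LeakyBucketRequestInterceptor(redis, shopId))   // runs before each request
                .addInterceptor(new LeakyBucketResponseInterceptor(redis, shopId))  // sees each response
                .build();
    }
}
```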
Request Interceptor
The first piece, the request interceptor, obtains a lock unique to the shop for which the request is being made. Then it looks in Redis for two pieces of information (setting them to their starting defaults if they don’t already exist):
- The timestamp of the last request for the shop
- The number of remaining requests allowed before the shop’s bucket is full
Using those two pieces of information, it determines whether the request bucket is full or not. If the bucket is full, it waits and retries until the bucket is no longer full. If the bucket is not full, it writes the updated state to Redis and lets the request proceed.
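Putting that together, a request interceptor along these lines would do the job. This is a hedged sketch that reuses the ShopLock and BucketMirror classes from earlier and uses illustrative Redis key names; it is not our exact library code.

```java
import java.io.IOException;
import java.time.Instant;

import okhttp3.Interceptor;
import okhttp3.Response;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

// A hedged sketch of the request-side logic described above, not our exact library code.
// It reuses the ShopLock and BucketMirror sketches from earlier in this post.
public class LeakyBucketRequestInterceptor implements Interceptor {
    private final JedisPool redis;
    private final String shopId;

    public LeakyBucketRequestInterceptor(JedisPool redis, String shopId) {
        this.redis = redis;
        this.shopId = shopId;
    }

    @Override
    public Response intercept(Chain chain) throws IOException {
        try (Jedis jedis = redis.getResource()) {
            ShopLock lock = new ShopLock(jedis, shopId);
            try {
                lock.acquire();

                // The two pieces of per-shop state, with defaults for a brand-new bucket.
                String lastRaw = jedis.get("shopify:" + shopId + ":lastRequestAt");
                String remainingRaw = jedis.get("shopify:" + shopId + ":remaining");
                Instant lastRequestAt = lastRaw != null ? Instant.ofEpochMilli(Long.parseLong(lastRaw)) : Instant.EPOCH;
                int remaining = remainingRaw != null ? Integer.parseInt(remainingRaw) : 40;

                // Credit back whatever has leaked out of the bucket since the last request,
                // and wait for a slot if the bucket is currently full.
                int available = BucketMirror.estimateRemaining(remaining, lastRequestAt);
                if (available < 1) {
                    Thread.sleep(BucketMirror.millisUntilNextSlot(available));
                    available = 1;
                }

                // Write the updated state back to Redis before letting the request go out.
                jedis.set("shopify:" + shopId + ":lastRequestAt", String.valueOf(System.currentTimeMillis()));
                jedis.set("shopify:" + shopId + ":remaining", String.valueOf(available - 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while waiting for bucket capacity", e);
            } finally {
                lock.release();
            }
        }
        return chain.proceed(chain.request());
    }
}
```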
Response Interceptor
The response interceptor looks for a particular header from Shopify that tells us how many requests are left in the bucket. It obtains the same lock the request interceptor uses and writes the value of the shop’s remaining requests to Redis for use in the next request for that shop.
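Here is a sketch of that response side, again reusing the ShopLock class from earlier. It assumes Shopify’s X-Shopify-Shop-Api-Call-Limit header, which reports bucket usage as “used/total” (for example, 32/40), and it uses the same illustrative key names as before.

```java
import java.io.IOException;

import okhttp3.Interceptor;
import okhttp3.Response;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

// A hedged sketch of the response-side logic described above, reusing the ShopLock sketch.
// It assumes Shopify's X-Shopify-Shop-Api-Call-Limit header, formatted as "used/total" (e.g. "32/40").
public class LeakyBucketResponseInterceptor implements Interceptor {
    private final JedisPool redis;
    private final String shopId;

    public LeakyBucketResponseInterceptor(JedisPool redis, String shopId) {
        this.redis = redis;
        this.shopId = shopId;
    }

    @Override
    public Response intercept(Chain chain) throws IOException {
        Response response = chain.proceed(chain.request());

        String callLimit = response.header("X-Shopify-Shop-Api-Call-Limit");
        if (callLimit != null) {
            String[] parts = callLimit.split("/");          // e.g. "32/40" -> used=32, total=40
            int remaining = Integer.parseInt(parts[1].trim()) - Integer.parseInt(parts[0].trim());

            try (Jedis jedis = redis.getResource()) {
                ShopLock lock = new ShopLock(jedis, shopId);
                try {
                    lock.acquire();
                    // Overwrite our mirrored count with Shopify's authoritative number.
                    jedis.set("shopify:" + shopId + ":remaining", String.valueOf(remaining));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.release();
                }
            }
        }
        return response;
    }
}
```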
Conclusions
With this approach in place, we have eliminated essentially all of the 429 responses we were getting from Shopify’s rate-limited API. Furthermore, processes that previously relied on arbitrary sleep periods now tend to run more quickly, because we follow Shopify’s algorithm closely and can squeeze out the best request rate possible for a given period of time. Conversely, concurrent operations against the same shop now happen one at a time. This incurs a time penalty that is totally acceptable to ensure consistent state in our services and to avoid errors that require manual intervention. So, if you’re interacting with any rate-limited APIs, don’t take the lazy route of arbitrary sleeps like I did. Match the rate-limiting algorithm the API is using and keep its state in a shared location so concurrent operations can coordinate.
Acknowledgements
To give credit where credit’s due: We based much of the rate limit interceptor code and algorithm on the shopify-api-java-wrapper project. Our Redis-based distributed locking mechanism is a slightly modified fork of the jedis-lock library.