How Request Coalescing Works and a Case Study from Discord.

This is a brilliant technique for handling database queries. Discord has used it, and it saved them from a lot of potential failures and downtime. It helped them store trillions of messages and fetch them without bringing the cluster down.

If multiple users request the same row at the same time, the database should be queried only once. This is what request coalescing is all about. In a typical request coalescing setup, special data services are built. These data services are intermediary services that sit between the API layer and the database cluster. The image below illustrates this.

The data services then implement the request coalescing. Here is what happens under the hood:

  • The first user that makes a request causes a worker task to spin up in the data service.

  • Subsequent requests for the same data will check for the existence of that task and subscribe to it.

  • Once the initial worker task queries the database and gets the result, it will return the row to all subscribers at the same time.
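The three steps above can be sketched in a few lines of Python with asyncio. This is only a minimal illustration, not Discord's actual implementation: the class name CoalescingService, the query_db callable, and the fake_query stand-in for the database are all assumptions made for the example.

```python
import asyncio


class CoalescingService:
    """Hypothetical data service that coalesces concurrent reads of one key."""

    def __init__(self, query_db):
        self._query_db = query_db  # assumed async callable: key -> row
        self._in_flight = {}       # key -> asyncio.Task running the query

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First requester: spin up the worker task that queries the database.
            task = asyncio.create_task(self._run(key))
            self._in_flight[key] = task
        # Every requester, first or later, waits on the same task.
        # shield() keeps one cancelled requester from cancelling the shared query.
        return await asyncio.shield(task)

    async def _run(self, key):
        try:
            return await self._query_db(key)
        finally:
            # Forget the task so that later requests trigger a fresh query.
            self._in_flight.pop(key, None)


async def demo():
    calls = []

    async def fake_query(key):
        # Stand-in for a real database call (an assumption for this sketch).
        calls.append(key)
        await asyncio.sleep(0.01)
        return {"id": key, "body": "hello"}

    service = CoalescingService(fake_query)
    # Five concurrent requests for the same row.
    rows = await asyncio.gather(*(service.get(42) for _ in range(5)))
    return rows, calls


rows, calls = asyncio.run(demo())
```

Here all five requesters receive the same row, while calls records a single database query. Because the entry is removed as soon as the query finishes, nothing is kept around afterwards, which is what separates this from a cache.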

The GIF below illustrates the whole process:

Request coalescing raises a number of questions too. For instance:

  • How is request coalescing different from caching?

With request coalescing, only one requester triggers the actual database query. The rest just subscribe to its result. With caching, all the requests would still have hit the cache.

  • Why not use caching instead of request coalescing?

Both can be used together, as they don't operate the same way. Request coalescing is aimed specifically at reducing the number of requests hitting the database.

  • How does it work internally?

Each worker has its own local state, which is primarily just a HashMap storing requests and a list of senders waiting for the response. Whenever a response comes in, the worker removes the request from the HashMap and propagates the result to all the requesters waiting for the response.
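The local state described above can be sketched as a plain dictionary that maps each request key to its list of waiting senders. This is a rough illustration under assumptions: WorkerState, send_query, and the callback-style senders are hypothetical names chosen for the example, and a Python dict stands in for the HashMap.

```python
class WorkerState:
    """Hypothetical local state of one worker: request key -> waiting senders."""

    def __init__(self, send_query):
        self._send_query = send_query  # assumed callable that dispatches the DB query
        self._pending = {}             # key -> list of callbacks awaiting the row

    def request(self, key, sender):
        waiters = self._pending.get(key)
        if waiters is None:
            # No query in flight for this key: record it and query the database.
            self._pending[key] = [sender]
            self._send_query(key)
        else:
            # A query is already in flight: just join the waiting list.
            waiters.append(sender)

    def on_response(self, key, row):
        # Remove the request from the map, then propagate the row to every waiter.
        for sender in self._pending.pop(key, []):
            sender(row)


queries = []   # keys that were actually sent to the (pretend) database
received = []  # (user, row) pairs delivered to requesters

state = WorkerState(queries.append)
for user in ("a", "b", "c"):
    state.request("msg:1", lambda row, user=user: received.append((user, row)))

# The database answers once; all three requesters get the row.
state.on_response("msg:1", "hello")
```

After this runs, queries holds a single entry for "msg:1" while received holds one delivery per user, and the map no longer contains the key.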

  • When should one consider request coalescing?

Request coalescing should only be considered under extreme conditions, such as Discord's, where the volume of requests became too much for a basic implementation.

Thanks for your time and see you in the next one. I will be talking about scaling cron scripts, using Slack as a case study.

Credits to https://progressivecoder.beehiiv.com/p/how-request-coalescing-works