MongoDB: A Case Study of Database Sharding.

Database sharding may sound simple from its description, but it is more involved in practice. Below is a real-world case study of how one database, MongoDB, implements sharding.

MongoDB Basics.


Sharding in MongoDB requires a cluster. A cluster is a group of interconnected servers, or nodes. To scale horizontally, you add more servers to the cluster.

The cluster is made up of three components:

  • Shard.

  • Mongos Router.

  • Config Servers.

The Shard.

A shard holds a subset of the data. Data is divided among a group of shards, and each shard is deployed as a replica set. This is a great thing, since replication and automated failover are provided out of the box. Applications do not send query requests directly to the shards.
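To make "data is divided among a group of shards" concrete, here is a minimal sketch in plain Python. It assumes a hash of one document field decides the placement. The shard names, the `shard_for` and `partition` helpers, and the modulo step are illustrative assumptions and not MongoDB's API; a real cluster assigns ranges of key values to shards rather than taking a simple modulo.

```python
import hashlib

# Hypothetical shard names; a real cluster would list its own replica sets.
SHARDS = ["shard0", "shard1", "shard2"]


def shard_for(key, shards=SHARDS):
    """Pick a shard from a stable hash of the key value (simplified)."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]


def partition(docs, key_field):
    """Group documents by the shard that would own each one."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(doc[key_field])].append(doc)
    return placement
```

Calling `partition([{"user_id": i} for i in range(50)], "user_id")` returns a dictionary with one list of documents per shard. The same key always lands on the same shard, which is what lets a router find the data again later.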

The Mongos Router.

The Mongos Router plays a key role in the cluster. All queries are sent to it. It performs two critical activities:

  • Query routing and load balancing.

  • Metadata caching.

The Router acts as a middleman to fetch data from the actual shards.
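The two activities above, routing and metadata caching, can be sketched with a toy router. This is a simplified model under stated assumptions, not how mongos is implemented: `ConfigServerStub`, `Router`, and the per-key cache are hypothetical names and choices made for illustration, and the shards are plain dictionaries.

```python
class ConfigServerStub:
    """Stand-in for the config servers: maps a key to a shard name."""

    def __init__(self, placement):
        self.placement = placement
        self.lookups = 0  # counts how often the router had to ask

    def shard_for(self, key):
        self.lookups += 1
        return self.placement[key]


class Router:
    """Toy router: caches placement metadata, then forwards the query."""

    def __init__(self, config, shards):
        self.config = config
        self.shards = shards  # shard name -> {key: document}
        self.cache = {}       # key -> shard name, filled on first use

    def find(self, key):
        # Metadata caching: ask the config stub only on a cache miss.
        if key not in self.cache:
            self.cache[key] = self.config.shard_for(key)
        # Query routing: forward the request to the owning shard.
        return self.shards[self.cache[key]].get(key)
```

With this sketch, a second `find` for the same key is answered from the router's cache, so the `lookups` counter on the stub does not increase. The application only ever talks to the `Router` object, which mirrors the middleman role described above.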

Config Servers.

The config servers run as a separate replica set. They store the metadata for the MongoDB sharded cluster.

Metadata is like the index for your cluster. It stores information such as:

  • How the data is organized.

  • What components are present in the cluster.

The router needs the config servers to find out where the data lives.
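As a rough picture of what such metadata could look like, the sketch below models it as a small Python structure: a list of the shards in the cluster and a table of key ranges with their owners. The host names, the range boundaries, and the `owner_of` helper are all hypothetical and chosen only for illustration; this is not the actual layout of MongoDB's config database.

```python
from bisect import bisect_right

# Hypothetical cluster metadata.
CLUSTER_METADATA = {
    # What components are present in the cluster.
    "shards": {
        "shard0": "shard0.example.internal:27018",
        "shard1": "shard1.example.internal:27018",
        "shard2": "shard2.example.internal:27018",
    },
    # How the data is organized: each range starts at its lower bound
    # and runs up to the next lower bound (the last one is open-ended).
    "range_lower_bounds": [0, 1000, 2000],
    "range_owners": ["shard0", "shard1", "shard2"],
}


def owner_of(user_id, metadata=CLUSTER_METADATA):
    """Return the shard name whose key range contains user_id."""
    bounds = metadata["range_lower_bounds"]
    if user_id < bounds[0]:
        raise ValueError("user_id is below the first range")
    index = bisect_right(bounds, user_id) - 1
    return metadata["range_owners"][index]
```

Here `owner_of(999)` resolves to `shard0` and `owner_of(1000)` resolves to `shard1`, because each range includes its lower bound and excludes the next one.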

Below is a snapshot of what the whole process looks like:

  • Application code queries the data.

  • The Mongos Router receives the Query.

  • The Router checks the Config Servers to find which shard has the data.

  • The Query is directed to the appropriate shard.

  • Data is returned to the Application.
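The five steps above can be traced end to end with a small simulation. Everything in it is an assumption made for illustration: the sample documents, the `CONFIG_METADATA` ranges, and the function names are hypothetical, and the shards are in-memory dictionaries rather than real servers.

```python
# Hypothetical data: two shards holding user documents keyed by user_id.
SHARD_DATA = {
    "shard0": {1: {"user_id": 1, "name": "Ada"}},
    "shard1": {1500: {"user_id": 1500, "name": "Linus"}},
}

# Hypothetical metadata: (inclusive lower, exclusive upper, owning shard).
CONFIG_METADATA = [(0, 1000, "shard0"), (1000, 2000, "shard1")]


def config_lookup(user_id):
    """Stand-in for the config servers: find the shard for a key."""
    for lower, upper, shard in CONFIG_METADATA:
        if lower <= user_id < upper:
            return shard
    raise KeyError(f"no shard owns user_id {user_id}")


def router_find(user_id, trace):
    """Stand-in for the router: look up the shard, then forward the query."""
    trace.append("router received query")
    shard = config_lookup(user_id)
    trace.append(f"config metadata points to {shard}")
    document = SHARD_DATA[shard].get(user_id)
    trace.append(f"query directed to {shard}")
    return document


def application_query(user_id):
    """Stand-in for application code: returns the document and a trace."""
    trace = ["application sent query"]
    document = router_find(user_id, trace)
    trace.append("data returned to application")
    return document, trace
```

Calling `application_query(1500)` returns the document stored on `shard1` together with a five-entry trace, one entry per step in the list above.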

Thank you for your time, and see you in the next one.

Credits to @ProgressiveCod2 on X (formerly Twitter).