MongoDB: A Case Study of Database Sharding.
Database sharding can sound simple from a high-level description, but in practice it is more involved. Below is a real-world case study of how one database, MongoDB, implements sharding.
MongoDB Basics.
Sharding in MongoDB requires a cluster. A cluster is a group of interconnected servers, or nodes. To scale horizontally, you add more servers to the cluster.
The cluster is made up of three components:
Shard.
Mongos Router.
Config Servers.
The shard.
A shard holds a subset of the data. The data is divided among a group of shards, and each shard is deployed as a replica set. This is a great thing, since replication and automated failover are provided out of the box. Applications do not send queries to the shards directly.
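To make the idea of dividing data among shards concrete, here is a minimal sketch in Python. It assumes a hypothetical `user_id` shard key and three made-up shard names, and it places each document by hashing the key and taking the result modulo the number of shards. This is a simplification for illustration only: MongoDB's own hashed sharding uses its own hash function and assigns ranges of hash values to shards rather than a plain modulo.

```python
import hashlib

# Hypothetical shard names; a real cluster would list its own replica sets.
SHARDS = ["shard0", "shard1", "shard2"]

def pick_shard(shard_key_value: str, shards=SHARDS) -> str:
    """Map a shard key value to one shard by hashing it (simplified)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

def partition(documents, key_field):
    """Group documents by the shard their key hashes to."""
    placement = {name: [] for name in SHARDS}
    for doc in documents:
        placement[pick_shard(str(doc[key_field]))].append(doc)
    return placement

# Twelve illustrative documents, each landing on exactly one shard.
users = [{"user_id": i, "name": f"user{i}"} for i in range(12)]
placement = partition(users, "user_id")
for name, docs in placement.items():
    print(name, [d["user_id"] for d in docs])
```

The same key always hashes to the same shard, which is what lets a router find a document later without scanning every shard.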
The Mongos Router.
The Mongos Router plays a key role in the cluster. All queries are sent to it. It performs two critical activities:
Query routing and load balancing.
Metadata caching.
The Router acts as a middleman to fetch data from the actual shards.
Config Servers.
The config servers run as a separate replica set. They store the metadata for the MongoDB sharded cluster.
Metadata is like an index for your cluster. It stores information such as:
How the data is organized.
What components are present in the cluster.
The router reads this metadata from the config servers to determine which shard holds the requested data.
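As a rough picture of metadata acting like an index, the sketch below models it as a small table of shard key ranges, each owned by one shard, and looks up the owner with a binary search. The boundaries, shard names, and the `find_shard` helper are all hypothetical, and the real config server collections hold considerably more detail than this.

```python
from bisect import bisect_right

# Hypothetical metadata: each range starts at its lower bound and runs up to
# (but not including) the next lower bound. The last range is open-ended.
RANGE_LOWER_BOUNDS = [0, 1000, 2000]
RANGE_OWNERS = ["shard0", "shard1", "shard2"]

def find_shard(shard_key: int) -> str:
    """Return the shard that owns the range containing shard_key."""
    if shard_key < RANGE_LOWER_BOUNDS[0]:
        raise KeyError(f"no range covers key {shard_key}")
    # bisect_right gives the count of lower bounds <= shard_key,
    # so subtracting 1 yields the index of the containing range.
    index = bisect_right(RANGE_LOWER_BOUNDS, shard_key) - 1
    return RANGE_OWNERS[index]

print(find_shard(999))   # key in the first range
print(find_shard(1000))  # key exactly on the second lower bound
```

Because the lookup only touches this small table, the router can decide where a query goes without contacting any shard first.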
Below is a snapshot of what the whole process looks like:
Application code queries the data.
The Mongos Router receives the query.
The Router checks the Config Servers to find which shard has the data.
The query is directed to the appropriate shard.
Data is returned to the Application.
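The five steps above can be simulated with a toy model in Python. Everything here is an assumption made for illustration: the `ConfigServer`, `Shard`, and `Router` classes, the `user_id` key, and the sample documents are invented and are not MongoDB's API. The sketch also shows the router caching metadata so that a repeated query does not ask the config server again.

```python
class ConfigServer:
    """Toy metadata store: maps each user_id to the shard that owns it."""
    def __init__(self, ownership):
        self._ownership = dict(ownership)
        self.lookups = 0  # counts how often the router had to ask

    def shard_for(self, key):
        self.lookups += 1
        return self._ownership[key]

class Shard:
    """Toy shard: holds its own subset of documents, indexed by user_id."""
    def __init__(self, documents):
        self._documents = {doc["user_id"]: doc for doc in documents}

    def find(self, key):
        return self._documents.get(key)

class Router:
    """Toy router: receives queries, resolves the shard, forwards the query."""
    def __init__(self, config, shards):
        self._config = config
        self._shards = shards
        self._cache = {}  # cached metadata: key -> shard name

    def find(self, key):
        # Step 3: consult the config server only if the answer is not cached.
        if key not in self._cache:
            self._cache[key] = self._config.shard_for(key)
        # Step 4: direct the query to the appropriate shard.
        shard = self._shards[self._cache[key]]
        # Step 5: return the data to the caller.
        return shard.find(key)

shards = {
    "shard0": Shard([{"user_id": 1, "name": "Ada"}]),
    "shard1": Shard([{"user_id": 2, "name": "Linus"}]),
}
config = ConfigServer({1: "shard0", 2: "shard1"})
router = Router(config, shards)

# Steps 1 and 2: the application sends its query to the router.
first = router.find(2)
second = router.find(2)  # served using the cached metadata
print(first, config.lookups)
```

The application only ever talks to `router`, which mirrors the point made earlier that no direct requests are made to the shards.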
Thank you for your time, and see you in the next one.
Credits to @ProgressiveCod2 on X (formerly Twitter).