This document describes the requirements, organization, and design of sharded cluster deployments.
A sharded cluster has the following components:
Three config servers.
For development and testing purposes you may deploy a cluster with a single config server process, but in production always use exactly three config servers for redundancy and safety.
Two or more shards. Each shard consists of one or more mongod instances that store the data for the shard.
These “normal” mongod instances hold all of the actual data for the cluster.
MongoDB enables data partitioning, or sharding, on a per-collection basis. You must access all data in a sharded cluster via the mongos instances, described below. If you connect directly to a mongod in a sharded cluster, you will see only that shard’s fraction of the cluster’s data. The data on any given shard is effectively arbitrary: MongoDB provides no guarantee that any two contiguous chunks will reside on a single shard.
One or more mongos instances.
These instances route queries from the application layer to the shards that hold the data. The mongos instances have no persistent state or data files; they only cache metadata in RAM from the config servers.
In most situations mongos instances use minimal resources, and you can run them on your application servers without impacting application performance. However, if you use the aggregation framework, some processing may occur on the mongos instances, causing them to require more system resources.
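For illustration, the following sketch shows how you might start each component type. The hostnames, ports, and data paths are placeholders, not prescribed values:

    # Start each of the three config servers (one per host):
    mongod --configsvr --dbpath /data/configdb --port 27019

    # Start a mongos, listing all three config servers:
    mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019

    # From a mongo shell connected to the mongos, add each shard:
    sh.addShard("shard0.example.net:27018")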
Your cluster must manage a significant quantity of data before sharding has any practical effect on your collection. The default chunk size is 64 megabytes, and the balancer will not begin moving data until the imbalance of chunks in the cluster exceeds the migration threshold.
Practically, this means that unless your cluster has many hundreds of megabytes of data, all chunks will remain on a single shard.
While there are some exceptional situations where you may need to shard a small collection of data, most of the time the additional complexity and overhead of sharding a small collection are not worthwhile unless you need additional concurrency or capacity. If you have a small data set, a properly configured single MongoDB instance or replica set will usually be more than sufficient for your persistence layer needs.
Chunk size is user configurable. However, the default value of 64 megabytes is ideal for most deployments. See the Chunk Size section in the Sharded Cluster Internals document for more information.
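For example, the following sketch changes the chunk size by modifying the settings collection in the config database, from a mongo shell connected to a mongos; the 32-megabyte value is illustrative only:

    use config
    db.settings.save( { _id: "chunksize", value: 32 } )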
Because all components of a sharded cluster must communicate with each other over the network, there are special restrictions regarding the use of localhost addresses:
If you use either “localhost” or “127.0.0.1” as the host identifier, then you must use “localhost” or “127.0.0.1” for all host settings for any MongoDB instances in the cluster. This applies both to the host argument to addShard and to the value of the mongos --configdb runtime option. If you mix localhost addresses with remote host addresses, MongoDB will produce errors.
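For instance, the following combination is consistent because every address uses localhost; replacing either address with a remote hostname, while leaving the other as-is, would produce errors. The port numbers are placeholders:

    mongos --configdb localhost:27019

    # then, from a mongo shell connected to that mongos:
    sh.addShard("localhost:27018")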
You can deploy a very minimal cluster for testing and development. These non-production clusters have the following components:
One config server.
At least one shard, running as either a standalone mongod instance or a replica set.
One mongos instance.
Use the test cluster architecture for testing and development only.
In a production cluster, you must ensure that data is redundant and that your systems are highly available. To that end, a production-level cluster must have the following components:
Three config servers, each residing on a discrete system.
Two or more replica sets to serve as shards.
Two or more mongos instances. Typically, you deploy a single mongos instance on each application server. Alternatively, you may deploy several mongos instances and let your application connect to them via a load balancer.
Sharding operates on the collection level. You can shard multiple collections within a database, or have multiple databases with sharding enabled. However, in production deployments some databases and collections will use sharding, while other databases and collections will reside on only a single database instance or replica set (i.e., a shard).
Regardless of the data architecture of your sharded cluster, ensure that all queries and operations use the mongos router to access the data cluster. Use the mongos even for operations that do not impact the sharded data.
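For example, to run administrative operations you would connect the mongo shell to a mongos rather than to any individual shard; the hostname below is a placeholder:

    mongo --host mongos0.example.net --port 27017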
Every database has a “primary” shard that holds all un-sharded collections in that database. All collections that are not sharded reside on the primary for their database. Use the movePrimary command to change the primary shard for a database. Use the db.printShardingStatus() or sh.status() helper to see an overview of the cluster, including information about chunk and database distribution within the cluster.
The movePrimary command can be expensive because it copies all non-sharded data to the new shard, during which that data will be unavailable for other operations.
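The following sketch shows both operations from a mongo shell connected to a mongos; the database and shard names are examples:

    // Inspect chunk and database distribution, including each
    // database's current primary shard:
    sh.status()

    // Move the un-sharded data of the "test" database to shard0001:
    use admin
    db.runCommand( { movePrimary: "test", to: "shard0001" } )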
When you deploy a new sharded cluster, the first shard becomes the primary for all databases that exist before you enable sharding. Databases created subsequently may reside on any shard in the cluster.
Note: As you configure sharding, you will use the enableSharding command to enable sharding for a database. This simply makes it possible to use the shardCollection command on a collection within that database.
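For example, assuming a records database with a people collection that you want to shard on its zipcode field (all names here are illustrative):

    // Enable sharding for the database:
    db.adminCommand( { enableSharding: "records" } )

    // Then shard a collection within it, specifying the shard key:
    db.adminCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )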
Note: The term “primary” in the context of databases and sharding has nothing to do with the term primary in the context of replica sets.
A production cluster has no single point of failure. The following scenarios describe potential failure conditions and how the cluster responds to each.
Application servers or mongos instances become unavailable.
If each application server has its own mongos instance, other application servers can continue to access the database. Furthermore, mongos instances do not maintain persistent state, so they can restart or become unavailable without losing any state or data. When a mongos instance starts, it retrieves a copy of the config database and can begin routing queries.
A single mongod becomes unavailable in a shard.
Replica sets provide high availability for shards. If the unavailable mongod is a primary, the replica set will elect a new primary. If the unavailable mongod is a secondary, the primary and remaining secondaries continue to hold all of the data. In a three-member replica set, even if a single member experiences catastrophic failure, the two other members have full copies of the data.
Always investigate availability interruptions and failures. If a system is unrecoverable, replace it with a new member of the replica set as soon as possible to restore the lost redundancy.
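When a shard is a replica set, you add it to the cluster by its set name, which allows the mongos instances to track elections and route operations to the current primary. A sketch with placeholder names:

    // Add a replica set named rs0 as a shard; the mongos discovers
    // the other members of the set from this seed:
    sh.addShard("rs0/rs0-member0.example.net:27017")

    // From a shell connected to a member of the set, check member
    // health and see which member is currently primary:
    rs.status()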
All members of a replica set become unavailable.
If all members of a replica set within a shard are unavailable, all data held in that shard is unavailable; however, the data on all other shards remains available, and it is possible to read from and write to the other shards. Your application must be able to deal with partial results, and you should investigate the cause of the interruption and attempt to recover the shard as soon as possible.
One or two config databases become unavailable.
Three distinct mongod instances provide the config database, using a two-phase commit to maintain consistent state among them. If one or two config databases are unavailable, cluster operation will continue as normal, but chunk migration cannot occur and the cluster cannot create new chunk splits. Replace unavailable config servers as soon as possible. If all config databases become unavailable, the cluster can become inoperable.
All config servers must be running and available when you first initiate a sharded cluster.
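If you suspect a config server outage, you can inspect the balancer from a mongo shell connected to a mongos; these helpers report whether balancing is enabled and whether a migration is currently in progress:

    sh.getBalancerState()     // true if the balancer is enabled
    sh.isBalancerRunning()    // true if a migration is in progress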
Note: If an unavailable secondary becomes available while it still has current oplog entries, it can catch up to the latest state of the set using the normal replication process; otherwise, it must perform an initial sync.