OPTIONS

Convert a Replica Set to a Replicated Sharded Cluster

Overview

This tutorial converts a single three-member replica set to a sharded cluster with two shards. Each shard is an independent three-member replica set. The procedure is as follows:

  1. Create the initial three-member replica set and insert data into a collection. See Set Up Initial Replica Set.
  2. Start the config databases and a mongos. See Deploy Config Databases and mongos.
  3. Add the initial replica set as a shard. See Add Initial Replica Set as a Shard.
  4. Create a second shard and add to the cluster. See Add Second Shard.
  5. Shard the desired collection. See Shard a Collection.

Prerequisites

This tutorial uses a total of ten servers: one server for the mongos and three servers each for the first replica set, the second replica set, and the config servers.

Each server must have a resolvable domain, hostname, or IP address within your system.

The tutorial uses the default data directories (e.g. /data/db and /data/configdb). Create the appropriate directories with appropriate permissions. To use different paths, see Configuration File Options .

The tutorial uses the default ports (e.g. 27017 and 27019). To use different ports, see Configuration File Options.

Considerations

In production deployments, use exactly three config servers. Each config server must be on a separate machine.

In development and testing environments, you can deploy a cluster with a single config server.

Procedures

Set Up Initial Replica Set

This procedure creates the initial three-member replica set rs0. The replica set members are on the following hosts: mongodb0.example.net, mongodb1.example.net, and mongodb2.example.net.

1

Start each member of the replica set with the appropriate options.

For each member, start a mongod, specifying the replica set name through the replSet option. Include any other parameters specific to your deployment. For replication-specific parameters, see Replication Options.

mongod --replSet "rs0"

Repeat this step for the other two members of the rs0 replica set.

2

Connect a mongo shell to a replica set member.

Connect a mongo shell to one member of the replica set (e.g. mongodb0.example.net)

mongo mongodb0.example.net
3

Initiate the replica set.

From the mongo shell, run rs.initiate() to initiate a replica set that consists of the current member.

rs.initiate()
4

Add the remaining members to the replica set.

rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
5

Create and populate a new collection.

The following step adds one million documents to the collection test_collection and can take several minutes depending on your system.

Issue the following operations on the primary of the replica set:

use test
var bulk = db.test_collection.initializeUnorderedBulkOp();
people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina", "Katie", "Jeff"];
for(var i=0; i<1000000; i++){
   user_id = i;
   name = people[Math.floor(Math.random()*people.length)];
   number = Math.floor(Math.random()*10001);
   bulk.insert( { "user_id":user_id, "name":name, "number":number });
}
bulk.execute();

For more information on deploying a replica set, see Deploy a Replica Set.

Deploy Config Databases and mongos

This procedure deploys the three config servers and the mongos. The config servers use the following hosts: mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net; the mongos uses mongodb6.example.net.

1

Start three config databases.

On each mongodb7.example.net, mongodb8.example.net, and mongodb9.example.net server, start the config server using default data directory /data/configdb and the default port 27019:

mongod --configsvr

To modify the default settings or to include additional options specific to your deployment, see Configuration File Options.

2

Start a mongos instance.

On mongodb6.example.net, start the mongos specifying the config servers. The mongos runs on the default port 27017.

This tutorial specifies a small --chunkSize of 1 MB to test sharding with the test_collection created earlier.

Note

In production environments, do not use a small chunkSize size.

mongos --configdb mongodb07.example.net:27019,mongodb08.example.net:27019,mongodb09.example.net:27019 --chunkSize 1

Add Initial Replica Set as a Shard

The following procedure adds the initial replica set rs0 as a shard.

1

Connect a mongo shell to the mongos.

mongo mongodb6.example.net:27017/admin
2

Add the shard.

Add a shard to the cluster with the sh.addShard method:

sh.addShard( "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" )

Add Second Shard

The following procedure deploys a new replica set rs1 for the second shard and adds it to the cluster. The replica set members are on the following hosts: mongodb3.example.net, mongodb4.example.net, and mongodb5.example.net.

1

Start each member of the replica set with the appropriate options.

For each member, start a mongod, specifying the replica set name through the replSet option. Include any other parameters specific to your deployment. For replication-specific parameters, see Replication Options.

mongod --replSet "rs1"

Repeat this step for the other two members of the rs1 replica set.

2

Connect a mongo shell to a replica set member.

Connect a mongo shell to one member of the replica set (e.g. mongodb3.example.net)

mongo mongodb3.example.net
3

Initiate the replica set.

From the mongo shell, run rs.initiate() to initiate a replica set that consists of the current member.

rs.initiate()
4

Add the remaining members to the replica set.

Add the remaining members with the rs.add() method.

rs.add("mongodb4.example.net")
rs.add("mongodb5.example.net")
5

Connect a mongo shell to the mongos.

mongo mongodb6.example.net:27017/admin
6

Add the shard.

In a mongo shell connected to the mongos, add the shard to the cluster with the sh.addShard() method:

sh.addShard( "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" )

Shard a Collection

1

Connect a mongo shell to the mongos.

mongo mongodb6.example.net:27017/admin
2

Enable sharding for a database.

Before you can shard a collection, you must first enable sharding for the collection’s database. Enabling sharding for a database does not redistribute data but makes it possible to shard the collections in that database.

The following operation enables sharding on the test database:

sh.enableSharding( "test" )

The operation returns the status of the operation:

{ "ok" : 1 }
3

Determine the shard key.

For the collection to shard, determine the shard key. The shard key determines how MongoDB distributes the documents between shards. Good shard keys:

  • have values that are evenly distributed among all documents,
  • group documents that are often accessed at the same time into contiguous chunks, and
  • allow for effective distribution of activity among shards.

Once you shard a collection with the specified shard key, you cannot change the shard key. For more information on shard keys, see Shard Keys and Considerations for Selecting Shard Keys.

This procedure will use the number field as the shard key for test_collection.

4

Create an index on the shard key.

Before sharding a non-empty collection, create an index on the shard key.

use test
db.test_collection.ensureIndex( { number : 1 } )
5

Shard the collection.

In the test database, shard the test_collection, specifying number as the shard key.

use test
sh.shardCollection( "test.test_collection", { "number" : 1 } )

The method returns the status of the operation:

{ "collectionsharded" : "test.test_collection", "ok" : 1 }

The balancer will redistribute chunks of documents when it next runs. As clients insert additional documents into this collection, the mongos will route the documents between the shards.

6

Confirm the shard is balancing.

To confirm balancing activity, run db.stats() or db.printShardingStatus() in the test database.

use test
db.stats()
db.printShardingStatus()

Example output of the db.stats():

{
  "raw" : {
      "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" : {
         "db" : "test",
         "collections" : 3,
         "objects" : 989316,
         "avgObjSize" : 111.99974123535857,
         "dataSize" : 110803136,
         "storageSize" : 174751744,
         "numExtents" : 14,
         "indexes" : 2,
         "indexSize" : 57370992,
         "fileSize" : 469762048,
         "nsSizeMB" : 16,
         "dataFileVersion" : {
            "major" : 4,
            "minor" : 5
         },
         "extentFreeList" : {
            "num" : 0,
            "totalSize" : 0
         },
         "ok" : 1
      },
      "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" : {
         "db" : "test",
         "collections" : 3,
         "objects" : 14697,
         "avgObjSize" : 111.98258147921345,
         "dataSize" : 1645808,
         "storageSize" : 2809856,
         "numExtents" : 7,
         "indexes" : 2,
         "indexSize" : 1169168,
         "fileSize" : 67108864,
         "nsSizeMB" : 16,
         "dataFileVersion" : {
            "major" : 4,
            "minor" : 5
         },
         "extentFreeList" : {
            "num" : 0,
            "totalSize" : 0
         },
         "ok" : 1
      }
  },
  "objects" : 1004013,
  "avgObjSize" : 111,
  "dataSize" : 112448944,
  "storageSize" : 177561600,
  "numExtents" : 21,
  "indexes" : 4,
  "indexSize" : 58540160,
  "fileSize" : 536870912,
  "extentFreeList" : {
      "num" : 0,
      "totalSize" : 0
  },
  "ok" : 1
}

Example output of the db.printShardingStatus():

--- Sharding Status ---
sharding version: {
   "_id" : 1,
   "version" : 4,
   "minCompatibleVersion" : 4,
   "currentVersion" : 5,
   "clusterId" : ObjectId("5446970c04ad5132c271597c")
}
shards:
   {  "_id" : "rs0",  "host" : "rs0/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net:27017" }
   {  "_id" : "rs1",  "host" : "rs1/mongodb3.example.net:27017,mongodb4.example.net:27017,mongodb5.example.net:27017" }
databases:
   {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
   {  "_id" : "test",  "partitioned" : true,  "primary" : "rs0" }

test.test_collection
      shard key: { "number" : 1 }
      chunks:
         rs1    5
         rs0    186
      too many chunks to print, use verbose if you want to force print

Run these commands for a second time to demonstrate that chunks are migrating from rs0 to rs1.