Chunk Migration Across Shards

Chunk migration moves the chunks of a sharded collection from one shard to another and is part of the balancer process.

Diagram of a collection distributed across three shards. For this collection, the difference in the number of chunks between the shards reaches the *migration thresholds* (in this case, 2) and triggers migration.

Diagram of a collection distributed across three shards. For this collection, the difference in the number of chunks between the shards reaches the migration thresholds (in this case, 2) and triggers migration.

Chunk Migration

MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards. Migrations may be either:

  • Manual. Only use manual migration in limited cases, such as to distribute data during bulk inserts. See Migrating Chunks Manually for more details.
  • Automatic. The balancer process automatically migrates chunks when there is an uneven distribution of a sharded collection’s chunks across the shards. See Migration Thresholds for more details.

All chunk migrations use the following procedure:

  1. The balancer process sends the moveChunk command to the source shard.

  2. The source starts the move with an internal moveChunk command. During the migration process, operations to the chunk route to the source shard. The source shard is responsible for incoming write operations for the chunk.

  3. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.

  4. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure that it has the changes to the migrated documents that occurred during the migration.

  5. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata with the new location for the chunk.

  6. After the destination shard completes the update of the metadata, and once there are no open cursors on the chunk, the source shard deletes its copy of the documents.

    Changed in version 2.4: If the balancer needs to perform additional chunk migrations from the source shard, the balancer can start the next chunk migration without waiting for the current migration process to finish this deletion step. See Chunk Migration Queuing.

The migration process ensures consistency and maximizes the availability of chunks during balancing.

Chunk Migration Queuing

Changed in version 2.4.

To migrate multiple chunks from a shard, the balancer migrates the chunks one at a time. However, the balancer does not wait for the current migration’s delete phase to complete before starting the next chunk migration. See Chunk Migration for the chunk migration process and the delete phase.

This queuing behavior allows shards to unload chunks more quickly in cases of heavily imbalanced cluster, such as when performing initial data loads without pre-splitting and when adding new shards.

This behavior also affect the moveChunk command, and migration scripts that use the moveChunk command may proceed more quickly.

In some cases, the delete phases may persist longer. If multiple delete phases are queued but not yet complete, a crash of the replica set’s primary can orphan data from multiple migrations.

Chunk Migration and Replication

By default, each document move during chunk migration propagates to at least one secondary before the balancer proceeds with its next operation.

To override this behavior and allow the balancer to continue before replicating to a secondary, set the _secondaryThrottle parameter to false. See Change Replication Behavior for Chunk Migration (Secondary Throttle) to update the _secondaryThrottle parameter for the balancer.

Independent of the secondaryThrottle setting, certain operations of the chunk migration have the following replication policy:

  • MongoDB briefly pauses all application writes to the source shard before updating the config servers with the new location for the chunk, and resumes the application writes after the update. The chunk commit requires all writes to be durably replicated to a majority of servers in order to proceed and finish.
  • When an outgoing chunk migration finishes and cleanup occurs, all writes must be replicated to a majority of servers before further cleanup (from other outgoing migrations) or new incoming migrations can proceed.

Changed in version 2.4: In previous versions, the balancer did not wait for the document move to replicate to a secondary. For details, see Secondary Throttle in the v2.2 Manual