Backup a Sharded Cluster with Filesystem Snapshots

Overview

This document describes a procedure for taking a backup of all components of a sharded cluster. This procedure uses file-system snapshots to capture a copy of each mongod instance. An alternate procedure uses mongodump to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with Database Dumps for the alternate procedure.

See MongoDB Backup Methods and Backup and Restore Sharded Clusters for complete information on backups in MongoDB and backups of sharded clusters in particular.

Important

To capture a point-in-time backup from a sharded cluster, you must stop all writes to the cluster. On a running production system, you can only capture an approximation of a point-in-time snapshot.

Procedure

In this procedure, you will stop the cluster balancer, take a backup of the config database, and then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time snapshot of the system, you must stop all application writes before taking the file-system snapshots; otherwise the snapshot will only approximate a moment in time.

For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the cluster by taking the backup from a secondary member of the replica set that provides each shard.
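As a minimal sketch of how a secondary might be chosen for each shard, the helper below scans a status document shaped like the output of rs.status() and returns the name of the first member that reports SECONDARY. The function name pickSecondary is hypothetical, and the sketch assumes the members array carries name and stateStr fields.

```javascript
// Sketch only: pickSecondary is a hypothetical helper, not part of MongoDB.
// `status` is assumed to have the shape returned by rs.status(): a `members`
// array whose entries carry `name` ("host:port") and `stateStr`.
function pickSecondary(status) {
  var members = status.members || [];
  for (var i = 0; i < members.length; i++) {
    if (members[i].stateStr === "SECONDARY") {
      // Return the address of the first secondary found.
      return members[i].name;
    }
  }
  // No member currently reports SECONDARY.
  return null;
}
```

In the mongo shell, while connected to any member of a shard's replica set, this could be called as pickSecondary(rs.status()). A null result means no secondary is currently available to lock.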

  1. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer, use the sh.stopBalancer() method in the mongo shell. For example:

    use config
    sh.stopBalancer()
    

    For more information, see the Disable the Balancer procedure.

    Warning

    It is essential that you stop the balancer before creating backups. If the balancer remains active, your resulting backups could have duplicate data or miss some data, as chunks may migrate while recording backups.

  2. Lock one secondary member of the replica set that provides each shard so that your backups reflect the state of your database at the nearest possible approximation of a single moment in time. Lock these mongod instances in as short an interval as possible.

    To lock a secondary, connect through the mongo shell to the secondary member’s mongod instance and issue the db.fsyncLock() method.

  3. Back up one of the config servers. Backing up a config server backs up the sharded cluster’s metadata. You need to back up only one config server, as they all hold the same data.

    Do one of the following to back up one of the config servers:

    • Create a file-system snapshot of the config server. Use the procedure in Backup and Restore with Filesystem Snapshots.

      Important

      This option is available only if the config server has journaling enabled. Never use db.fsyncLock() on config databases.

    • Use mongodump to back up the config server. Issue mongodump against one of the config mongod instances or via the mongos.

      If you are running MongoDB 2.4 or later with the --configsvr option, then include the --oplog option when running mongodump to ensure that the dump includes a partial oplog containing operations from the duration of the mongodump operation. For example:

      mongodump --oplog --db config
      
  4. Back up the replica set members of the shards that you locked. You may back up the shards in parallel. For each shard, create a snapshot. Use the procedure in Backup and Restore with Filesystem Snapshots.

  5. Unlock all locked replica set members of each shard using the db.fsyncUnlock() method in the mongo shell.

  6. Re-enable the balancer with the sh.setBalancerState() method.

    Use the following command sequence when connected to the mongos with the mongo shell:

    use config
    sh.setBalancerState(true)
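
Steps 2, 4, and 5 above can be sketched as a single helper that locks every chosen secondary, takes the snapshots, and then always unlocks, even if a snapshot fails. This is an illustration under stated assumptions, not a definitive implementation: the function name backupLockedSecondaries and its parameters hosts, connect, and takeSnapshot are hypothetical. Only getDB(), db.fsyncLock(), and db.fsyncUnlock() are mongo shell methods. Stopping and re-enabling the balancer (steps 1 and 6) and backing up the config server (step 3) are left out and would run around or between these steps.

```javascript
// Sketch only: backupLockedSecondaries, `hosts`, `connect`, and
// `takeSnapshot` are hypothetical names supplied by the caller.
//   hosts        - array of "host:port" strings, one secondary per shard
//   connect      - function(host) returning a connection with getDB()
//   takeSnapshot - function(host) that triggers the file-system snapshot
function backupLockedSecondaries(hosts, connect, takeSnapshot) {
  var locked = []; // admin database handles that currently hold a lock
  var done = [];   // hosts whose snapshot completed
  try {
    // Step 2: lock all secondaries first, in as short an interval as possible.
    hosts.forEach(function (host) {
      var admin = connect(host).getDB("admin");
      admin.fsyncLock();
      locked.push(admin);
    });
    // Step 4: snapshot each locked member.
    hosts.forEach(function (host) {
      takeSnapshot(host);
      done.push(host);
    });
  } finally {
    // Step 5: unlock every member that was locked, even after a failure.
    locked.forEach(function (admin) {
      admin.fsyncUnlock();
    });
  }
  return done;
}
```

In the mongo shell, connect could be function (h) { return new Mongo(h); }. How takeSnapshot invokes the snapshot tool depends on the storage system and is outside this sketch. Locking in one pass before any snapshot starts keeps the lock times close together, which is what makes the backup approximate a single moment in time.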