Backups are an important part of any operational disaster recovery plan. A good backup plan must capture data in a consistent and usable state, and operators must be able to automate both the backup and the recovery operations. Test every component of the backup system to ensure that you can recover backed-up data when needed. If you cannot restore your database from a backup, that backup is useless.
As you develop a backup strategy for your MongoDB deployment, consider the following factors:
There are two main methodologies for backing up MongoDB instances: creating binary “dumps” of the database using mongodump, or creating filesystem-level snapshots. Both methodologies have advantages and disadvantages:
The best option depends on the requirements of your deployment and your disaster recovery needs. Typically, filesystem snapshots are preferred because of their accuracy and simplicity; however, mongodump is a viable option that is often used to generate backups of MongoDB systems.
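As an illustration of the mongodump approach, the following is a minimal Python sketch that wraps the tool so a scheduler can run it. The function names, the default backup directory, and the timestamped directory layout are assumptions made for this example, not part of MongoDB. It relies only on the mongodump options --host, --port, --out, and --oplog.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_mongodump_command(host, port, out_dir, oplog=False):
    """Return the argument list for a mongodump run (does not execute it)."""
    cmd = ["mongodump", "--host", host, "--port", str(port), "--out", str(out_dir)]
    if oplog:
        # --oplog also captures writes made while the dump runs;
        # it only works against a member that maintains an oplog.
        cmd.append("--oplog")
    return cmd


def run_backup(host="localhost", port=27017, backup_root="/var/backups/mongodb"):
    """Dump into a timestamped directory under backup_root (hypothetical path)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    out_dir = Path(backup_root) / stamp
    # check=True raises CalledProcessError if mongodump exits with an error,
    # so a failed backup is not silently treated as a success.
    subprocess.run(build_mongodump_command(host, port, out_dir), check=True)
    return out_dir
```

Keeping command construction separate from execution makes the wrapper easy to test without a running database. A complete plan would pair it with a scripted restore check, since a dump that has never been restored is unverified.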
In some cases, taking backups is difficult or impossible because of large data volumes, distributed architectures, and data transmission speeds. In these situations, increase the number of members in your replica set or sets.
In most cases, backing up data stored in a replica set is similar to backing up data stored in a single instance. Options include:
If you have a sharded cluster where each shard is itself a replica set, you can use one of these methods to create a backup of the entire cluster without disrupting the operation of the node. Even in these situations, you should still turn off the balancer while you create backups.
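One way to make sure the balancer is re-enabled even if the backup fails is to wrap the backup in a context manager. The sketch below assumes the mongosh shell is installed and that the given URI points at a mongos; the function names and the injectable run parameter are hypothetical conveniences for this example. It uses the shell helpers sh.stopBalancer() and sh.startBalancer().

```python
import subprocess
from contextlib import contextmanager


def balancer_command(mongos_uri, enable, shell="mongosh"):
    """Build the shell invocation that starts or stops the balancer."""
    helper = "sh.startBalancer()" if enable else "sh.stopBalancer()"
    return [shell, mongos_uri, "--quiet", "--eval", helper]


@contextmanager
def balancer_paused(mongos_uri, run=subprocess.run):
    """Stop the balancer for the duration of the with-block, then restart it."""
    run(balancer_command(mongos_uri, enable=False), check=True)
    try:
        yield
    finally:
        # Runs even if the backup inside the with-block raises.
        run(balancer_command(mongos_uri, enable=True), check=True)
```

A backup script would then place its per-shard backup steps inside `with balancer_paused("mongodb://mongos.example.net:27017"):`, where the host name is a placeholder.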
For any cluster, using a non-primary node to create backups is particularly advantageous because the backup operation does not affect the performance of the primary. Replication itself provides some measure of redundancy. Nevertheless, keeping point-in-time backups of your cluster is crucial, both for disaster recovery and as an additional layer of protection.
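Choosing the non-primary node can itself be automated. The following sketch assumes you have already fetched the members array from the replica set status output (for example, from rs.status() in the shell) and picks a healthy secondary, preferring the one with the most recent optimeDate. The function name and the selection policy are assumptions for illustration; the fields name, stateStr, health, and optimeDate are those reported per member in the status output.

```python
def pick_backup_member(members):
    """Return the name of a healthy SECONDARY to back up from, or None."""
    candidates = [
        m for m in members
        if m.get("stateStr") == "SECONDARY" and m.get("health") == 1
    ]
    if not candidates:
        # No usable secondary: the caller decides whether to skip the
        # backup or fall back to the primary.
        return None
    # Prefer the secondary that has applied the most recent operation.
    return max(candidates, key=lambda m: m["optimeDate"])["name"]
```

Returning None rather than silently falling back to the primary keeps the decision to load the primary explicit in the calling script.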