FAQ: MongoDB Storage

This document addresses common questions regarding MongoDB’s storage system.

If you don’t find the answer you’re looking for, check the complete list of FAQs or post your question to the MongoDB User Mailing List.

Storage Engine Fundamentals

What is a storage engine?

A storage engine is the part of a database that is responsible for managing how data is stored on disk. Many databases support multiple storage engines, where different engines perform better for specific workloads. For example, one storage engine might offer better performance for read-heavy workloads, and another might support higher throughput for write operations.

What will be the default storage engine going forward?

MMAPv1 is the default storage engine in 3.0. With multiple storage engines, you can decide which storage engine is best for your application.

Can you mix storage engines in a replica set?

Yes. A replica set can have members that use different storage engines.

When designing these multi-storage-engine deployments, consider the following:

  • the oplog on each member may need to be sized differently to account for differences in throughput between different storage engines.
  • recovery from backups may become more complex if your backup captures data files from MongoDB: you may need to maintain backups for each storage engine.

WiredTiger Storage Engine

Can I upgrade an existing deployment to WiredTiger?

Yes. You can upgrade an existing deployment to WiredTiger while the deployment remains available by adding replica set members with the new storage engine and then removing members with the legacy storage engine. See Upgrade MongoDB to 3.0 for the complete procedure that you can use to upgrade an existing deployment.

How much compression does WiredTiger provide?

The ratio of compressed data to uncompressed data depends on your data and the compression library used. By default, collection data in WiredTiger use Snappy block compression; zlib compression is also available. Index data use prefix compression by default.
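
To get a rough idea of the ratio for one of your own collections, you can compare two fields of the document returned by db.collection.stats(). The following is a minimal sketch, not an official tool. The function name compressionRatio is hypothetical, and the sketch assumes that size holds the uncompressed data size and storageSize holds the on-disk allocation, both in bytes and both plain numbers:

```javascript
// Sketch: estimate the compressed-to-uncompressed ratio of a collection.
// Assumption: `stats` is the document returned by db.collection.stats(),
// with `size` (uncompressed bytes) and `storageSize` (bytes on disk).
function compressionRatio(stats) {
   if (!stats || !stats.size) {
      return null;   // empty collection: no meaningful ratio
   }
   // Values below 1 mean the data takes less space on disk than in memory.
   return stats.storageSize / stats.size;
}
```

In the mongo shell, you could call it as compressionRatio(db.orders.stats()). Treat the result as an approximation: storageSize can include space that has been allocated but is not currently in use.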

To what size should I set the WiredTiger cache?

The size of the cache should be sufficient to hold the entire working set for the mongod. If the cache does not have enough space to load additional data, WiredTiger evicts pages from the cache to free up space.

To see statistics on the cache and eviction, use the serverStatus command. The cache field holds the information on the cache and eviction:

...
"wiredTiger" : {
   ...
   "cache" : {
      "tracked dirty bytes in the cache" : <num>,
      "bytes currently in the cache" : <num>,
      "maximum bytes configured" : <num>,
      "bytes read into cache" : <num>,
      "bytes written from cache" : <num>,
      "pages evicted by application threads" : <num>,
      "checkpoint blocked page eviction" : <num>,
      "unmodified pages evicted" : <num>,
      "page split during eviction deepened the tree" : <num>,
      "modified pages evicted" : <num>,
      "pages selected for eviction unable to be evicted" : <num>,
      "pages evicted because they exceeded the in-memory maximum" : <num>,
      "pages evicted because they had chains of deleted items" : <num>,
      "failed eviction of pages that exceeded the in-memory maximum" : <num>,
      "hazard pointer blocked page eviction" : <num>,
      "internal pages evicted" : <num>,
      "maximum page size at eviction" : <num>,
      "eviction server candidate queue empty when topping up" : <num>,
      "eviction server candidate queue not empty when topping up" : <num>,
      "eviction server evicting pages" : <num>,
      "eviction server populating queue, but not evicting pages" : <num>,
      "eviction server unable to reach eviction goal" : <num>,
      "pages split during eviction" : <num>,
      "pages walked for eviction" : <num>,
      "eviction worker thread evicting pages" : <num>,
      "in-memory page splits" : <num>,
      "percentage overhead" : <num>,
      "tracked dirty pages in the cache" : <num>,
      "pages currently held in the cache" : <num>,
      "pages read into cache" : <num>,
      "pages written from cache" : <num>
   },
   ...

To adjust the size of the WiredTiger cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB.
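
When deciding whether the cache is large enough, it can help to reduce the cache statistics to a couple of percentages. The following is a minimal sketch that uses three of the field names shown in the output above. The function name cacheUsage is hypothetical, and the sketch assumes the values are plain numbers:

```javascript
// Sketch: summarize how full the WiredTiger cache is.
// Assumption: `cache` is the wiredTiger.cache sub-document of the
// serverStatus output, and its values are plain numbers.
function cacheUsage(cache) {
   var used = cache["bytes currently in the cache"];
   var max = cache["maximum bytes configured"];
   var dirty = cache["tracked dirty bytes in the cache"];
   return {
      usedPercent: 100 * used / max,     // share of the configured cache in use
      dirtyPercent: 100 * dirty / max    // share holding modified, unwritten data
   };
}
```

In the mongo shell, you could call it as cacheUsage(db.serverStatus().wiredTiger.cache). A cache that stays close to full while the eviction counters keep climbing may be too small for the working set.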

MMAPv1 Storage Engine

What are memory-mapped files?

A memory-mapped file is a file with data that the operating system places in memory by way of the mmap() system call. mmap() thus maps the file to a region of virtual memory. Memory-mapped files are the critical piece of the MMAPv1 storage engine in MongoDB. By using memory-mapped files, MongoDB can treat the contents of its data files as if they were in memory. This provides MongoDB with an extremely fast and simple method for accessing and manipulating data.

How do memory-mapped files work?

MongoDB uses memory-mapped files for managing and interacting with all data.

Memory mapping assigns files to a block of virtual memory with a direct byte-for-byte correlation. MongoDB memory maps data files to memory as it accesses documents. Unaccessed data is not mapped to memory.

Once mapped, the relationship between file and memory allows MongoDB to interact with the data in the file as if it were memory.

Why are the files in my data directory larger than the data in my database?

The data files in your data directory, which is the /data/db directory in default configurations, might be larger than the data set inserted into the database. Consider the following possible causes:

Preallocated data files

MongoDB preallocates its data files to avoid filesystem fragmentation, and because of this, the size of these files does not necessarily reflect the size of your data.

The storage.mmapv1.smallFiles option will reduce the size of these files, which may be useful if you have many small databases on disk.

The oplog

If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated capped collection in the local database.

The default allocation is approximately 5% of disk space on 64-bit installations. In most cases, you should not need to resize the oplog. See Oplog Sizing for more information.
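
As a worked example of the 5% figure, the following sketch estimates how much of a data directory the default oplog might occupy. The function name and its parameters are hypothetical, and the sketch models only the approximate percentage stated above, not any platform-specific minimum or maximum:

```javascript
// Sketch: estimate the default oplog allocation from the disk size.
// Assumption: the allocation is a flat percentage of `diskBytes`;
// any platform-specific lower or upper bounds are not modeled.
function estimateDefaultOplogBytes(diskBytes, percent) {
   if (percent === undefined) {
      percent = 5;   // "approximately 5% of disk space"
   }
   return Math.floor(diskBytes * percent / 100);
}
```

For a 200 gigabyte volume, estimateDefaultOplogBytes(200 * 1024 * 1024 * 1024) gives roughly 10 gigabytes.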

The journal

The data directory contains the journal files, which store write operations on disk before MongoDB applies them to databases. See Journaling Mechanics.

Empty records

MongoDB maintains lists of empty records in data files as it deletes documents and collections. MongoDB can reuse this space, but will not, by default, return this space to the operating system.

To de-fragment allocated storage, use compact. By de-fragmenting storage, MongoDB can more effectively use the allocated space. compact requires up to 2 gigabytes of extra disk space to run. Do not use compact if you are critically low on disk space.

compact only removes fragmentation from MongoDB data files within a collection, and does not return any disk space to the operating system.

If you must reclaim disk space, you can use repairDatabase. This command rebuilds the database, de-fragmenting the associated storage in the process, and may release space to the operating system. Because repairDatabase rewrites the data files, it needs substantially more free disk space than compact. Do not use repairDatabase if you are critically low on disk space.

Warning

repairDatabase requires enough free disk space to hold both the old and new database files while the repair is running. Be aware that repairDatabase will block all other operations and may take a long time to complete.
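
Before choosing between compact and repairDatabase, it can be useful to estimate how much allocated space is not holding documents. The following is a minimal sketch. The function name allocationOverhead is hypothetical, and the sketch assumes the dataSize and storageSize fields of the db.stats() output are plain numbers in bytes:

```javascript
// Sketch: estimate the space allocated to collections that does not
// currently hold document data (for example, empty records and padding).
// Assumption: `dbStats` is the document returned by db.stats().
function allocationOverhead(dbStats) {
   var unused = dbStats.storageSize - dbStats.dataSize;
   return {
      unusedBytes: unused,
      // Share of the allocated collection storage that holds no data.
      unusedPercent: dbStats.storageSize > 0
         ? 100 * unused / dbStats.storageSize
         : 0
   };
}
```

In the mongo shell, you could call it as allocationOverhead(db.stats()). A large share of unused space suggests that de-fragmenting could help; a small one suggests that the files are large for another reason, such as preallocation.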

What is the working set?

The working set is the total body of data that the application uses in the course of normal operation. Often this is a subset of the total data size, but the specific size of the working set depends on actual moment-to-moment use of the database.

If you run a query that requires MongoDB to scan every document in a collection, the working set will expand to include every document. Depending on physical memory size, this may cause documents in the working set to “page out,” or to be removed from physical memory by the operating system. The next time MongoDB needs to access these documents, MongoDB may incur a hard page fault.

For best performance, the majority of your working set should fit in RAM.

What are page faults?

With the MMAPv1 storage engine, page faults can occur as MongoDB reads from or writes data to parts of its data files that are not currently located in physical memory. In contrast, operating system page faults happen when physical memory is exhausted and pages of physical memory are swapped to disk.

If there is free memory, then the operating system can find the page on disk and load it to memory directly. However, if there is no free memory, the operating system must:

  • find a page in memory that is stale or no longer needed, and write the page to disk.
  • read the requested page from disk and load it into memory.

This process, on an active system, can take a long time, particularly in comparison to reading a page that is already in memory.

See Page Faults for MMAPv1 Storage Engine for more information.
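
One way to watch for this cost is to track how quickly the page fault counter grows. The following is a minimal sketch. The function name pageFaultRate is hypothetical, and the sketch assumes that both arguments are serverStatus documents from a platform that reports extra_info.page_faults as a plain number:

```javascript
// Sketch: page faults per second between two serverStatus samples.
// Assumption: `earlier` and `later` both contain extra_info.page_faults,
// and `elapsedSeconds` is the time between the two samples.
function pageFaultRate(earlier, later, elapsedSeconds) {
   var delta = later.extra_info.page_faults - earlier.extra_info.page_faults;
   return delta / elapsedSeconds;
}
```

In the mongo shell, you could take one db.serverStatus() sample, wait with sleep(60000), take a second sample, and then call pageFaultRate(first, second, 60). A single high reading is less telling than a rate that stays high under normal load.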

What is the difference between soft and hard page faults?

Page faults occur when MongoDB, with the MMAPv1 storage engine, needs access to data that isn’t currently in active memory. A “hard” page fault refers to situations when MongoDB must access a disk to access the data. A “soft” page fault, by contrast, merely moves memory pages from one list to another, such as from an operating system file cache.

See Page Faults for MMAPv1 Storage Engine for more information.

Data Storage Diagnostics

How can I check the size of a collection?

To view the statistics for a collection, including the data size, use the db.collection.stats() method from the mongo shell. The following example issues db.collection.stats() for the orders collection:

db.orders.stats();

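MongoDB also provides the following methods to return specific sizes for the collection:

  • db.collection.dataSize() to return the data size in bytes for the collection.
  • db.collection.storageSize() to return the allocation size in bytes, including unused space.
  • db.collection.totalSize() to return the data size plus the index size in bytes.
  • db.collection.totalIndexSize() to return the index size in bytes.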

The following script prints the statistics for each database:

// Print the dbStats output for every database on this mongod.
db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
   var mdb = db.getSiblingDB(d.name);
   printjson(mdb.stats());
});

The following script prints the statistics for each collection in each database:

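// Print the collection statistics for every collection in every database.
db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
   var mdb = db.getSiblingDB(d.name);
   mdb.getCollectionNames().forEach(function (c) {
      // getCollection() avoids clashes between collection names and DB method names.
      printjson(mdb.getCollection(c).stats());
   });
});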

How can I check the size of indexes for a collection?

To view the size of the data allocated for an index, use the db.collection.stats() method and check the indexSizes field in the returned document.
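
If a collection has many indexes, sorting the indexSizes field makes the largest ones easy to spot. The following is a minimal sketch. The function name indexesBySize is hypothetical, and the sketch assumes that indexSizes maps each index name to a size in bytes:

```javascript
// Sketch: list a collection's indexes from largest to smallest.
// Assumption: `stats` is the document returned by db.collection.stats(),
// and stats.indexSizes maps index names to sizes in bytes.
function indexesBySize(stats) {
   var sizes = stats.indexSizes || {};
   return Object.keys(sizes)
      .map(function (name) {
         return { name: name, bytes: sizes[name] };
      })
      .sort(function (a, b) {
         return b.bytes - a.bytes;   // descending by size
      });
}
```

In the mongo shell, you could call it as printjson(indexesBySize(db.orders.stats())).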

How can I get information on the storage use of a database?

The db.stats() method in the mongo shell returns the current state of the “active” database. For the description of the returned fields, see dbStats Output.