Design Notes

This page details features of MongoDB that may be important to bear in mind when designing your applications.

Schema Considerations

Dynamic Schema

Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts for more information.

Some operational considerations include:

  • the exact set of collections to be used;
  • the indexes to be used: with the exception of the _id index, all indexes must be created explicitly;
  • shard key declarations: choosing a good shard key is very important as the shard key cannot be changed once set.

Avoid importing unmodified data directly from a relational database. In general, you will want to “roll up” certain data into richer documents that take advantage of MongoDB’s support for sub-documents and nested arrays.
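As a rough illustration of rolling up relational data, the sketch below groups flat order rows into one document per order with a nested array of line items. The row shape and the field names (orderId, customer, sku, qty) are hypothetical and are not part of any MongoDB API. The function is plain JavaScript and should run in a recent shell or in Node.js.

```javascript
// Hypothetical flat rows, as they might come out of a relational JOIN.
const rows = [
  { orderId: 1, customer: "Ann", sku: "A-100", qty: 2 },
  { orderId: 1, customer: "Ann", sku: "B-200", qty: 1 },
  { orderId: 2, customer: "Bob", sku: "A-100", qty: 5 }
];

// Group rows by orderId into one document per order, nesting the line items.
function rollUpOrders(flatRows) {
  const byId = new Map();
  for (const row of flatRows) {
    if (!byId.has(row.orderId)) {
      byId.set(row.orderId, { _id: row.orderId, customer: row.customer, items: [] });
    }
    byId.get(row.orderId).items.push({ sku: row.sku, qty: row.qty });
  }
  return Array.from(byId.values());
}

const orderDocs = rollUpOrders(rows);
```

Each element of orderDocs could then be inserted as a single document, so that reading an order does not require a join.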

Case Sensitive Strings

MongoDB strings are case sensitive, so a search for "joe" will not find "Joe".

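Consider:

  • storing data in a normalized case format, or
  • using regular expressions ending with /i, and/or
  • using $toLower or $toUpper in the aggregation framework.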
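A minimal sketch of case normalization and of a case-insensitive regular expression query, assuming a hypothetical users collection with a name field. The helper names are illustrative only.

```javascript
// Normalize case before storing a value and before searching for it.
function normalizeName(name) {
  return name.trim().toLowerCase();
}

// Build a case-insensitive, anchored regular expression query.
// Special characters are escaped so that the input is matched literally.
function caseInsensitiveQuery(field, value) {
  const escaped = value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return { [field]: new RegExp("^" + escaped + "$", "i") };
}

const query = caseInsensitiveQuery("name", "joe");
// In the shell this could be passed as: db.users.find(query)
```

Storing a normalized copy generally lets an ordinary index serve the lookup, whereas a case-insensitive regular expression usually cannot use an index efficiently.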

Type Sensitive Fields

MongoDB data is stored in the BSON format, a binary encoded serialization of JSON-like documents. BSON encodes additional type information. See bsonspec.org for more information.

Consider the following document which has a field x with the string value "123":

{ x : "123" }

Then the following query which looks for a number value 123 will not return that document:

db.mycollection.find( { x : 123 } )
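If it is uncertain whether a value was stored as a number or as a string, one option is to match both representations with the $in operator. A sketch, with a hypothetical helper name:

```javascript
// Build a query that matches a value stored either as a number or as a string.
function eitherTypeQuery(field, value) {
  return { [field]: { $in: [Number(value), String(value)] } };
}

const q = eitherTypeQuery("x", "123");
// q is { x: { $in: [123, "123"] } }
// In the shell: db.mycollection.find(q) would match both { x : "123" } and { x : 123 }
```

In the longer term it is usually cleaner to store each field with one consistent type and to convert input before querying.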

General Considerations

By Default, Updates Affect One Document

To update multiple documents that meet your query criteria, set the multi option of the update method to true or 1. See Update Multiple Documents.

Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional boolean options. See the update method reference documentation.
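A sketch of the options-document form of a multi-document update, written as a function so that it does not depend on a particular collection. The function name and the status and archived fields are hypothetical; the collection argument is assumed to be a shell collection object such as db.mycollection.

```javascript
// Set archived: true on every document whose status is "closed".
// Without { multi: true }, update() would modify only one matching document.
function archiveClosed(collection) {
  return collection.update(
    { status: "closed" },          // query criteria
    { $set: { archived: true } },  // modification to apply
    { multi: true }                // apply to all matching documents
  );
}

// Usage in the shell: archiveClosed(db.mycollection)
```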

BSON Document Size Limit

The BSON document size limit is currently 16 MB per document. If you require larger documents, use GridFS.

No Fully Generalized Transactions

MongoDB does not have fully generalized transactions. If you model your data using rich documents that closely resemble your application’s objects, each logical object will be in one MongoDB document. MongoDB allows you to modify a document in a single atomic operation. This kind of data modification pattern covers most common uses of transactions in other systems.
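For example, appending a line item and adjusting an order total can be one atomic operation when both live in the same document. A sketch, assuming a hypothetical orders collection whose documents carry an items array and a numeric total field:

```javascript
// Append an item and increase the total in a single update operation.
// Both changes target the same document, so they are applied together.
function addItem(orders, orderId, item) {
  return orders.update(
    { _id: orderId },
    {
      $push: { items: item },
      $inc: { total: item.price * item.qty }
    }
  );
}

// Usage in the shell: addItem(db.orders, 1, { sku: "A-100", qty: 2, price: 5 })
```

Had the items been stored in a separate collection, the two changes would have needed two operations, with no guarantee that both succeed.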

Replica Set Considerations

Use an Odd Number of Replica Set Members

Replica sets perform consensus elections. To ensure that elections will proceed successfully, either use an odd number of members, typically three, or else use an arbiter to ensure an odd number of votes.
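The arithmetic behind this advice can be sketched as follows, assuming that electing a primary requires a strict majority of the voting members. The function names are illustrative.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Voting members that can be unavailable while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(n + " members: majority " + majority(n) +
              ", tolerates " + faultTolerance(n) + " unavailable");
}
```

Under this assumption a four-member set tolerates no more failures than a three-member set, which is why an even member count is normally rounded up to odd with an arbiter or an additional data-bearing member.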

Keep Replica Set Members Up-to-Date

MongoDB replica sets support automatic failover. It is important for your secondaries to be up-to-date. There are various strategies for assessing consistency:

  1. Use monitoring tools to alert you to lag events. See Monitoring for MongoDB for a detailed discussion of MongoDB’s monitoring options.
  2. Specify an appropriate write concern.
  3. If your application requires manual failover, you can configure your secondaries as priority 0. Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but large deployments should fail over automatically.
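The priority 0 configuration mentioned in the list above can be sketched as a small helper that edits a replica set configuration object. The helper name is hypothetical; rs.conf() and rs.reconfig() are the shell helpers assumed in the usage comment, and the member index is an example only.

```javascript
// Set the member at memberIndex to priority 0 so that it cannot become primary.
// The configuration object is modified in place and returned.
function setPriorityZero(config, memberIndex) {
  config.members[memberIndex].priority = 0;
  return config;
}

// Usage in the shell, connected to the primary:
//   var cfg = rs.conf();
//   rs.reconfig(setPriorityZero(cfg, 2));
```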

Sharding Considerations

  • Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded.

  • Shard key values are immutable.

  • When enabling sharding on an existing collection, MongoDB imposes a maximum size on that collection to ensure that it is possible to create chunks. For a detailed explanation of this limit, see Sharding Existing Collection Data Size.

    To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source collection using an application level import operation.

  • Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for Sharded Collections.

  • Consider pre-splitting a sharded collection before a massive bulk import.
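A sketch of choosing split points ahead of a bulk import, assuming a numeric shard key whose values are spread roughly evenly over a known range. The function name, the namespace mydb.events, and the key userId are hypothetical; sh.splitAt() is the shell helper assumed in the usage comment.

```javascript
// Compute evenly spaced split points for a numeric shard key between min and max.
// Dividing the range into `chunks` pieces needs chunks - 1 split points.
function evenSplitPoints(min, max, chunks) {
  const points = [];
  const step = (max - min) / chunks;
  for (let i = 1; i < chunks; i++) {
    points.push(Math.floor(min + i * step));
  }
  return points;
}

const points = evenSplitPoints(0, 1000, 4);
// In the shell, each point could then be applied with, for example:
//   points.forEach(function (p) { sh.splitAt("mydb.events", { userId: p }); });
```

Evenly spaced points only help when the key values are themselves evenly distributed; for skewed data, split points are better derived from a sample of the data to be imported.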