OPTIONS

Model Tree Structures in MongoDB

To model hierarchical or nested data relationships, you can use references to implement tree-like structures. The following Tree pattern examples model book categories that have hierarchical relationships.

Model Tree Structures with Child References

(link)

The Child References pattern stores each tree node in a document; in addition to the tree node, document stores in an array the id(s) of the node’s children.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Child References, storing the reference to the node’s children in the field children:

db.categories.insert( { _id: "MongoDB", children: [] } )
db.categories.insert( { _id: "dbm", children: [] } )
db.categories.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } )
db.categories.insert( { _id: "Languages", children: [] } )
db.categories.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } )
db.categories.insert( { _id: "Books", children: [ "Programming" ] } )
  • The query to retrieve the immediate children of a node is fast and straightforward:

    db.categories.findOne( { _id: "Databases" } ).children
    
  • You can create an index on the field children to enable fast search by the child nodes:

    db.categories.ensureIndex( { children: 1 } )
    
  • You can query for a node in the children field to find its parent node as well as its siblings:

    db.categories.find( { children: "MongoDB" } )
    

The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple parents.

Model Tree Structures with Parent References

(link)

The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores the id of the node’s parent.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Parent References, storing the reference to the parent category in the field parent:

db.categories.insert( { _id: "MongoDB", parent: "Databases" } )
db.categories.insert( { _id: "dbm", parent: "Databases" } )
db.categories.insert( { _id: "Databases", parent: "Programming" } )
db.categories.insert( { _id: "Languages", parent: "Programming" } )
db.categories.insert( { _id: "Programming", parent: "Books" } )
db.categories.insert( { _id: "Books", parent: null } )
  • The query to retrieve the parent of a node is fast and straightforward:

    db.categories.findOne( { _id: "MongoDB" } ).parent
    
  • You can create an index on the field parent to enable fast search by the parent node:

    db.categories.ensureIndex( { parent: 1 } )
    
  • You can query by the parent field to find its immediate children nodes:

    db.categories.find( { parent: "Databases" } )
    

The Parent Links pattern provides a simple solution to tree storage but requires multiple queries to retrieve subtrees.

Model Tree Structures with an Array of Ancestors

(link)

The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, document stores in an array the id(s) of the node’s ancestors or path.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Array of Ancestors. In addition to the ancestors field, these documents also store the reference to the immediate parent category in the parent field:

db.categories.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } )
db.categories.insert( { _id: "Books", ancestors: [ ], parent: null } )
  • The query to retrieve the ancestors or path of a node is fast and straightforward:

    db.categories.findOne( { _id: "MongoDB" } ).ancestors
    
  • You can create an index on the field ancestors to enable fast search by the ancestors nodes:

    db.categories.ensureIndex( { ancestors: 1 } )
    
  • You can query by the field ancestors to find all its descendants:

    db.categories.find( { ancestors: "Programming" } )
    

The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working with subtrees.

The Array of Ancestors pattern is slightly slower than the Materialized Paths pattern but is more straightforward to use.

Model Tree Structures with Materialized Paths

(link)

The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, document stores as a string the id(s) of the node’s ancestors or path. Although the Materialized Paths pattern requires additional steps of working with strings and regular expressions, the pattern also provides more flexibility in working with the path, such as finding nodes by partial paths.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Materialized Paths, storing the path in the field path; the path string uses the comma , as a delimiter:

db.categories.insert( { _id: "Books", path: null } )
db.categories.insert( { _id: "Programming", path: ",Books," } )
db.categories.insert( { _id: "Databases", path: ",Books,Programming," } )
db.categories.insert( { _id: "Languages", path: ",Books,Programming," } )
db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
  • You can query to retrieve the whole tree, sorting by the field path:

    db.categories.find().sort( { path: 1 } )
    
  • You can use regular expressions on the path field to find the descendants of Programming:

    db.categories.find( { path: /,Programming,/ } )
    
  • You can also retrieve the descendants of Books where the Books is also at the topmost level of the hierarchy:

    db.categories.find( { path: /^,Books,/ } )
    
  • To create an index on the field path use the following invocation:

    db.categories.ensureIndex( { path: 1 } )
    

    This index may improve performance depending on the query:

    • For queries of the Books sub-tree (e.g. /^,Books,/) an index on the path field improves the query performance significantly.

    • For queries of the Programming sub-tree (e.g. /,Programming,/), or similar queries of sub-tress, where the node might be in the middle of the indexed string, the query must inspect the entire index.

      For these queries an index may provide some performance improvement if the index is significantly smaller than the entire collection.

Model Tree Structures with Nested Sets

(link)

The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application visits each node in the tree twice; first during the initial trip, and second during the return trip. The Nested Sets pattern stores each tree node in a document; in addition to the tree node, document stores the id of node’s parent, the node’s initial stop in the left field, and its return stop in the right field.

Consider the following hierarchy of categories:

Example of a hierarchical data. The numbers identify the stops at nodes during a roundtrip traversal of a tree.

The following example models the tree using Nested Sets:

db.categories.insert( { _id: "Books", parent: 0, left: 1, right: 12 } )
db.categories.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } )
db.categories.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } )
db.categories.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } )
db.categories.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } )
db.categories.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } )

You can query to retrieve the descendants of a node:

var databaseCategory = db.categories.findOne( { _id: "Databases" } );
db.categories.find( { left: { $gt: databaseCategory.left }, right: { $lt: databaseCategory.right } } );

The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the tree structure. As such, this pattern is best for static trees that do not change.