Navigation
This version of the documentation is archived and no longer supported.

Aggregation Pipeline and Sharded Collections

The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the aggregation pipeline and sharded collections.

Behavior

When operating on a sharded collection, the aggregation pipeline is split into two parts. The first pipeline runs on each shard, or if an early $match can exclude shards through the use of the shard key in the predicate, the pipeline runs on only the relevant shards.

The second pipeline consists of the remaining pipeline stages and runs on the mongos. The mongos merges the cursors from the other shards and runs the second pipeline on these results.

When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many stages as possible.

Impact of Aggregation Pipelines on mongos

Changed in version 2.2.

Some aggregation pipeline operations will cause mongos instances to require more CPU resources than in previous versions. This modified performance profile may dictate alternate architectural decisions if you use the aggregation pipeline extensively in a sharded environment.