Shard GridFS Data Store
When sharding a GridFS store, consider the following:
files Collection
Most deployments will not need to shard the files collection. The files collection is typically small and contains only metadata. None of the required keys for GridFS lend themselves to an even distribution across a sharded cluster. If you must shard the files collection, use the _id field, possibly in combination with an application field.
Leaving files unsharded means that all the file metadata documents live on one shard. For production GridFS stores, you must store the files collection on a replica set.
chunks Collection
To shard the chunks collection by { files_id : 1 , n : 1 }, run the shardCollection command on it with that document as the shard key.
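The step above might look like the following in the mongo shell. This is a minimal sketch, not the only way to do it: the database name test and the default GridFS bucket prefix fs are assumptions, and it presumes sharding is already enabled for that database. The command document is built as a plain object so the file also loads outside the shell.

```javascript
// Assumptions: database "test", default GridFS bucket prefix "fs".
const chunksNamespace = "test.fs.chunks";
const chunksKey = { files_id: 1, n: 1 };

// Command document for the shardCollection admin command.
const shardChunksCommand = { shardCollection: chunksNamespace, key: chunksKey };

// `db` only exists inside the mongo shell; the guard keeps this file
// loadable in other JavaScript runtimes, where it simply does nothing.
if (typeof db !== "undefined") {
  const chunks = db.getSiblingDB("test").getCollection("fs.chunks");

  // A non-empty collection needs an index that supports the shard key.
  // GridFS drivers normally create this unique index already, in which
  // case the call is a no-op.
  chunks.createIndex(chunksKey, { unique: true });

  // shardCollection must be run against the admin database.
  printjson(db.adminCommand(shardChunksCommand));
}
```

Afterwards, sh.status() in the shell should list the collection together with its shard key.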
You may also want to shard using only the files_id field, by passing { files_id : 1 } as the shard key instead.
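As a sketch of the files_id-only variant, under the same assumptions as before (a database named test, the default fs bucket prefix, sharding already enabled for the database), the command document differs only in its key:

```javascript
// Assumptions: database "test", default GridFS bucket prefix "fs".
const shardByFileCommand = {
  shardCollection: "test.fs.chunks",
  key: { files_id: 1 }, // all chunks of one file share a shard key value
};

// `db` only exists inside the mongo shell; elsewhere this block is skipped.
if (typeof db !== "undefined") {
  // shardCollection must be run against the admin database.
  printjson(db.adminCommand(shardByFileCommand));
}
```

With this key, every chunk document of a given file has the same shard key value, so a file's chunks cannot be split across shards.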
Important
{ files_id : 1 , n : 1 } and { files_id : 1 } are the only supported shard keys for the chunks collection of a GridFS store.
Note
Changed in version 2.2. Before 2.2, you had to create an additional index on files_id to shard using only this field.
The default files_id value is an ObjectId. As a result, the values of files_id are always ascending, and applications will insert all new GridFS data into a single chunk and shard. If your write load is too high for a single server to handle, consider a different shard key or use a different value for _id in the files collection.
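
To illustrate why ascending files_id values concentrate writes, here is a toy model in plain JavaScript. It is not MongoDB's routing code: objectIdHex and rangeIndex are hypothetical helpers, the split points are invented, and the only fact borrowed from the real system is that an ObjectId begins with a 4-byte timestamp in seconds.

```javascript
// Simplified ObjectId layout: 4-byte timestamp (seconds), 5-byte random
// value, 3-byte counter, rendered as 24 lowercase hex characters.
function objectIdHex(seconds, randomHex, counter) {
  return (
    seconds.toString(16).padStart(8, "0") +
    randomHex +
    counter.toString(16).padStart(6, "0")
  );
}

// Hypothetical range router: returns the index of the key range that
// contains `key`, given sorted split points. Range i covers
// [splitPoints[i - 1], splitPoints[i]); the last range is open-ended.
function rangeIndex(key, splitPoints) {
  let i = 0;
  while (i < splitPoints.length && key >= splitPoints[i]) i++;
  return i;
}

const randomHex = "a1b2c3d4e5"; // fixed for the demo; random per process in reality
const start = 1700000000;       // arbitrary Unix time in seconds

// Invented split points, derived from files uploaded earlier.
const splitPoints = [
  objectIdHex(start - 2000, randomHex, 0),
  objectIdHex(start - 1000, randomHex, 0),
];

// Ten new files, one per second. Each files_id sorts after every split
// point, so each one maps to the last range (index 2).
const newIds = Array.from({ length: 10 }, (_, k) =>
  objectIdHex(start + k, randomHex, k)
);
const targets = newIds.map((id) => rangeIndex(id, splitPoints));
console.log(targets);
```

In this model every new file lands in the final range, which corresponds to the single chunk and shard that receive all inserts. Values for _id that do not increase with time would spread across the ranges instead.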