MongoDB Connector for Hadoop

The MongoDB Connector for Hadoop is a plugin for Hadoop that provides the ability to use MongoDB as an input source and/or an output destination.

The source code is available on Github where you can find a more comprehensive wiki.

If you have questions please email the mongodb-user Mailing List. For any issues please file a ticket in Jira.

This guide also includes the following documentation:


  1. Obtain the Hadoop connector. You may either download the JARs or build it yourself. The JARs are universal and will work with any version of Hadoop.
  2. Obtain a JAR for the MongoDB Java Driver.
  3. Move these JARs onto each node of the Hadoop cluster. You may provision each node so that the jars are in the lib directory for hadoop (i.e. $HADOOP_HOME/lib), or you can use the Hadoop DistributedCache to move the JARs onto pre-existing nodes.

For more complete install instructions, please see the installation instructions for your platform on the Hadoop Connector wiki.