Map reduce 2 #441

Merged
merged 2 commits into from
Nov 30, 2012
53 changes: 49 additions & 4 deletions source/applications/map-reduce.txt
@@ -169,8 +169,9 @@ Run the first map-reduce operation as follows:
Subsequent Incremental Map-Reduce
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-Later when, the ``sessions`` collection grows, by adding the following
-documents, you can run additional map-reduce operations:
+Later as the ``sessions`` collection grows, you can run additional
+map-reduce operations. For example, add new documents to the
+``sessions`` collection:

.. code-block:: javascript

@@ -208,6 +209,50 @@ periodically with the same target collection name without affecting
the intermediate states. Use this mode when
generating statistical output collections on a regular basis.

+.. _map-reduce-concurrency:
+
+Concurrency
+-----------
+
+The map-reduce operation is composed of many tasks, including:
+
+- reads from the input collection,
+
+- executions of the ``map`` function,
+
+- executions of the ``reduce`` function,
+
+- writes to the output collection.
+
+These various tasks take the following locks:
+
+- The read phase takes a read lock. It yields every 100 documents.
+
+- The JavaScript code (i.e. ``map``, ``reduce``, ``finalize``
+  functions) is executed in a single thread, taking a JavaScript lock;
+  however, most JavaScript tasks in map-reduce are very short and
+  yield the lock frequently.
+
+- The insert into the temporary collection takes a write lock for a
+  single write.
+
+If the output collection does not exist, the creation of the output
+collection takes a write lock.
+
+If the output collection exists, then the output actions (i.e.
+``merge``, ``replace``, ``reduce``) take a write lock.
+
+Although single-threaded, the map-reduce tasks interleave and appear to
+run in parallel.
+
+.. note::
+
+   The final write lock during post-processing makes the results appear
+   atomically. However, output actions ``merge`` and ``reduce`` may
+   take minutes to process. For the ``merge`` and ``reduce``, the
+   ``nonAtomic`` flag is available. See the
+   :method:`db.collection.mapReduce()` reference for more information.
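
As a rough illustration of the task breakdown in the Concurrency section, the following plain-JavaScript sketch models a map-reduce run as a sequence of short steps on a single thread: it reads the input in batches of 100, runs the ``map`` and ``reduce`` functions, and finishes with one write of the results. This is a toy in-memory model, not MongoDB's implementation. The names ``runToyMapReduce``, ``BATCH_SIZE``, and ``log``, and the sample ``userid`` and ``length`` fields, are assumptions made for the example.

```javascript
// Toy, in-memory model of the map-reduce task sequence.
// NOT MongoDB's implementation; all names here are hypothetical.
const BATCH_SIZE = 100; // mirrors "yields every 100 documents"

function runToyMapReduce(docs, map, reduce) {
  const log = [];            // records the order of the simulated tasks
  const emitted = new Map(); // key -> array of emitted values

  for (let start = 0; start < docs.length; start += BATCH_SIZE) {
    // Read phase: read one batch, then "yield" before the next one.
    const batch = docs.slice(start, start + BATCH_SIZE);
    log.push("read " + batch.length);

    // Map phase: each document may emit (key, value) pairs.
    for (const doc of batch) {
      map(doc, (key, value) => {
        if (!emitted.has(key)) emitted.set(key, []);
        emitted.get(key).push(value);
      });
    }
    log.push("map");
  }

  // Reduce phase: collapse the values emitted for each key.
  // A key with a single value is passed through without calling reduce.
  const output = {};
  for (const [key, values] of emitted) {
    output[key] = values.length === 1 ? values[0] : reduce(key, values);
  }
  log.push("reduce");

  // Output phase: one final "write" publishes all results at once.
  log.push("write");
  return { output, log };
}

// Hypothetical input: 250 session documents spread over two users.
const docs = [];
for (let i = 0; i < 250; i++) {
  docs.push({ userid: i % 2 === 0 ? "a" : "b", length: 10 });
}

const result = runToyMapReduce(
  docs,
  (doc, emit) => emit(doc.userid, doc.length),
  (key, values) => values.reduce((sum, v) => sum + v, 0)
);

console.log(result.output);
console.log(result.log.join(", "));
```

The ``log`` array shows the reads and ``map`` executions alternating batch by batch, which is the sense in which the tasks interleave even though only one step runs at a time.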

.. _map-reduce-sharded-cluster:

Sharded Cluster
@@ -271,10 +316,10 @@ In MongoDB 2.0:

.. warning::

-For best results only use the sharded output options for
+For best results, only use the sharded output options for
:dbcommand:`mapReduce` in version 2.2 or later.

-Troubleshooting Map Reduce Operations
+Troubleshooting Map-Reduce Operations
-------------------------------------

You can troubleshoot the ``map`` function and the ``reduce`` function
5 changes: 3 additions & 2 deletions source/includes/parameters-map-reduce.rst
@@ -182,12 +182,13 @@
.. versionadded:: 2.1

Optional. Specify output operation as non-atomic and is
-valid *only* for ``merge`` and ``reduce`` output modes.
+valid *only* for ``merge`` and ``reduce`` output modes which
+may take minutes to execute.

If ``nonAtomic`` is ``true``, the post-processing step will
prevent MongoDB from locking the database; however, other
clients will be able to read intermediate states of the
-output database. Otherwise the map reduce operation must
+output collection. Otherwise the map reduce operation must
lock the database during post-processing.

- **Output inline**. Perform the map-reduce operation in memory
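
To make the ``nonAtomic`` rule in the parameter description concrete, here is a small sketch of a helper that builds the ``out`` document for a map-reduce call and rejects ``nonAtomic`` for any action other than ``merge`` or ``reduce``. The helper ``buildOutOption`` and the output collection name ``session_stat`` are hypothetical, and the ``{ <action>: <collection>, nonAtomic: <boolean> }`` shape is assumed from the ``mapReduce`` reference rather than shown in this diff.

```javascript
// Hypothetical helper: builds the `out` document for a map-reduce call.
// `nonAtomic` is accepted only for the `merge` and `reduce` actions.
function buildOutOption(action, collection, nonAtomic) {
  const actions = ["replace", "merge", "reduce"];
  if (!actions.includes(action)) {
    throw new Error("unknown output action: " + action);
  }
  const out = {};
  out[action] = collection;
  if (nonAtomic) {
    if (action === "replace") {
      throw new Error("nonAtomic is valid only for merge and reduce");
    }
    out.nonAtomic = true;
  }
  return out;
}

// "session_stat" is an assumed output collection name.
const out = buildOutOption("merge", "session_stat", true);
console.log(JSON.stringify(out));

// In the mongo shell, the document could then be passed along these lines
// (mapFunction and reduceFunction are placeholders defined elsewhere):
//   db.sessions.mapReduce(mapFunction, reduceFunction, { out: out });
```

Validating the combination before calling the server keeps the intent explicit: with ``nonAtomic`` set, other clients may read intermediate states of the output collection while post-processing runs.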