made edits, but left some TODOs for you #52

Merged
merged 5 commits into from
Jun 27, 2012
102 changes: 54 additions & 48 deletions draft/core/indexes.txt
@@ -7,34 +7,37 @@ Index Overview
Synopsis
--------

Indexes are an internal representation of the documents in your
database organized so that MongoDB can use them to quickly locate
documents and fulfill queries very efficiently. Fundamentally, indexes
An index is a data structure that allows you to quickly locate documents
based on the values stored in certain specified fields. Fundamentally, indexes
in MongoDB are similar to indexes in other database systems. MongoDB
supports indexes on any field or sub-field contained in documents
within a MongoDB collection. Consider the following core features of
indexes:

- MongoDB defines indexes on a per-:term:`collection` level.

- Every query (including update operations,) can use one and only one
index. The query optimizer determines, empirically, the best query
plan and indexes to use on a specific query, but can be overridden
using the :func:`cursor.hint()` method. However, :ref:`compound
indexes <index-type-compound>` make it possible to include multiple
fields in a single index.

- Indexes often dramatically increase the performance of queries;
however, each index creates a slight overhead for every write
operation.

- Every query (including update operations) uses one and only one
index. The query optimizer determines which index to use
empirically, by occasionally running multiple query plans,
and tracking the most performant index for each query type.
The query optimizer's choice can be overridden
using the :func:`cursor.hint()` method.

- Indexes can be created over a single field, or multiple fields using a
:ref:`compound index <index-type-compound>`.

- Queries that are "covered" by the index return more quickly
than documents that have to scan many individual documents.
than queries that have to scan many individual documents. An index
"covers" a query if all the data that the query must return
is stored within the keys of the index.

- By using queries with good index coverage, it possible for MongoDB
to only store the index itself and the most often used documents in
memory, which can maximize database capacity, performance and
throughput.
- Using queries with good index coverage will reduce the number of full
documents that MongoDB needs to store in memory, thus maximizing database
performance and throughput.
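
To make the idea of a covered query more concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements indexes; the ``products`` sample data and the ``buildIndex`` and ``findByItem`` helpers are hypothetical and exist only for illustration. The sketch counts how many full documents must be fetched when the requested fields are, or are not, all present in the index key.

```javascript
// Hypothetical sample collection; in MongoDB these would be stored documents.
const products = [
  { _id: 1, item: "olive", stock: 12, description: "long text ..." },
  { _id: 2, item: "bread", stock: 3, description: "long text ..." },
  { _id: 3, item: "olive", stock: 7, description: "long text ..." },
];

// Compare two index keys field by field, in the order the fields are listed.
function compareKeys(a, b, fields) {
  for (const f of fields) {
    if (a[f] < b[f]) return -1;
    if (a[f] > b[f]) return 1;
  }
  return 0;
}

// Build a toy index: one sorted entry per document, holding only the
// indexed fields plus a reference (the array position) to the full document.
function buildIndex(docs, fields) {
  return docs
    .map((doc, position) => {
      const key = {};
      for (const f of fields) key[f] = doc[f];
      return { key, position };
    })
    .sort((a, b) => compareKeys(a.key, b.key, fields));
}

// Answer "find by item" from the index. If every wanted field is in the
// index key, the query is covered and no full document is fetched.
// (A real index would seek to the matching range instead of scanning.)
function findByItem(index, docs, item, wantedFields) {
  let documentsFetched = 0;
  const results = [];
  for (const entry of index) {
    if (entry.key.item !== item) continue;
    const covered = wantedFields.every((f) => f in entry.key);
    let source = entry.key;
    if (!covered) {
      documentsFetched += 1;
      source = docs[entry.position];
    }
    results.push(Object.fromEntries(wantedFields.map((f) => [f, source[f]])));
  }
  return { results, documentsFetched };
}

const index = buildIndex(products, ["item", "stock"]);
const coveredQuery = findByItem(index, products, "olive", ["item", "stock"]);
const uncoveredQuery = findByItem(index, products, "olive", ["item", "description"]);

console.log(coveredQuery.documentsFetched);   // 0
console.log(uncoveredQuery.documentsFetched); // 2
```

In this toy model the covered query never touches the ``products`` array, which is the property that lets a database keep fewer full documents in memory.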

Continue reading for a complete overview of indexes in MongoDB,
including the :ref:`types of indexes <index-types>`, basic
@@ -66,7 +69,7 @@ _id

The ``_id`` index is a :ref:`unique index <index-type-unique>`
[#unique-index-report]_ on the ``_id`` field, and MongoDB creates this
index by default on all collections. [#capped-collections]_ You cannot
index by default on all collections (except for capped collections [#capped-collections]_). You cannot
delete the index on ``_id``.

The ``_id`` field is the :term:`primary key` for the collection, and
@@ -77,17 +80,20 @@ are 12-byte, unique identifiers, that make suitable ``_id`` values.

.. note::

In :term:`shard clusters <shard cluster>`, if the you do *not* use
In :term:`shard clusters <shard cluster>`, if you do *not* use
the ``_id`` field as the :term:`shard key`, then your application
**must** ensure the uniqueness of the values in the ``_id`` field
to prevent errors.
to prevent errors. This is most often done by using the standard
auto-generated :term:`ObjectIds`.

.. [#unique-index-report] Although the index on ``_id`` *is* unique,
the :func:`getIndexes() <db.collection.getIndexes()>` method will
*not* print ``unique: true`` in the :program:`mongo` shell.

.. [#capped-collections] Capped collections are a special collection
which do not have an ``_id`` index.
.. [#capped-collections] Capped collections are special collections
which do not have an ``_id`` index by default.
TODO: figure out what new behavior of capped collections is.
(i think in replset capped collections now have _id by default)

.. _index-types-secondary:

@@ -106,9 +112,9 @@ primary, common, and user-facing queries and require MongoDB to scan
the fewest number of documents possible.

To create a secondary index, use the :func:`ensureIndex()`
method. The specifications an index using the :func:`ensureIndex()
<db.collection.ensureIndex()>` operation will resemble the following
on the MongoDB shell:
method. The argument to :func:`ensureIndex()
<db.collection.ensureIndex()>` will resemble the following
in the MongoDB shell:

.. code-block:: javascript

@@ -139,7 +145,7 @@ documents that resemble the following example document:
}
}

You could create an index on the ``address.zipcode`` field, using the
You can create an index on the ``address.zipcode`` field, using the
following specification:

.. code-block:: javascript
@@ -157,7 +163,7 @@ Compound Indexes
MongoDB supports "compound indexes," where a single index structure
holds references to multiple fields within a collection's
documents. Consider the collection ``products`` that holds documents
that resemble the following an example document:
that resemble the following example document:

.. code-block:: javascript

@@ -178,21 +184,24 @@ specify a single compound index to support both of these queries:

db.products.ensureIndex( { "item": 1, "stock": 1 } )

Note that the order of the fields in a compound index is very important.
Intuitively, the index above contains references to the documents sorted by
``item``, and within each item, sorted by ``stock``.
MongoDB will be able to use this index to support queries that select
the ``item`` field as well as those queries that select the ``item``
field **and** the ``stock`` field. However, these indexes will not
support queries that select *only* the ``stock`` field.
field **and** the ``stock`` field. However, this index will not
be useful for queries that select *only* the ``stock`` field.
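
One way to see why field order matters is to look at where matching entries sit in the sorted index. The following plain JavaScript sketch is only an illustration under simplifying assumptions: the ``entries`` array stands in for an index on ``{ "item": 1, "stock": 1 }``, and ``matchingPositions`` and ``isContiguous`` are hypothetical helpers, not MongoDB functions.

```javascript
// Hypothetical index entries for { "item": 1, "stock": 1 }, listed in
// index order: sorted by item, and within each item sorted by stock.
const entries = [
  { item: "apple", stock: 5 },
  { item: "apple", stock: 40 },
  { item: "bread", stock: 5 },
  { item: "bread", stock: 12 },
  { item: "olive", stock: 5 },
];

// Return the positions of the entries that satisfy a predicate.
function matchingPositions(predicate) {
  return entries.flatMap((entry, i) => (predicate(entry) ? [i] : []));
}

// True when the positions form one unbroken run, which means a single
// range scan over the index would find every match.
function isContiguous(positions) {
  return positions.every((p, i) => i === 0 || p === positions[i - 1] + 1);
}

const byItem = matchingPositions((e) => e.item === "bread");
const byItemAndStock = matchingPositions((e) => e.item === "bread" && e.stock === 12);
const byStockOnly = matchingPositions((e) => e.stock === 5);

console.log(isContiguous(byItem));         // true
console.log(isContiguous(byItemAndStock)); // true
console.log(isContiguous(byStockOnly));    // false
```

Matches on ``item``, or on ``item`` and ``stock`` together, cluster in one place, while matches on ``stock`` alone are scattered across the whole structure.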

Ascending and Descending
````````````````````````

Indexes store references to fields in either ascending or descending
order. The order of keys often doesn't matter because MongoDB can
transverse the index in either direction. However, in compound
indexes, for some kinds of sort operations, it's useful to have the
fields running in opposite order.
order. For single-field indexes, the order of keys doesn't matter,
because MongoDB can traverse the index in either direction. However, for
compound indexes, it is occasionally useful to have the fields running in
opposite order relative to each other.

To specify an index with an ascending order, use the following form:
To specify an index with a descending order, use the following form:

.. code-block:: javascript

@@ -207,6 +216,9 @@ following:
db.products.ensureIndex( { "field0": 1, "field1": -1 } )

.. TODO understand the sort operations better.
.. TODO Kevin's note: a good example here might be an index on
{"username" : 1, "timestamp" : -1} which would be helpful for listing
event history (most recent first) for all users (alphabetically).
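
The ordering suggested in the note above can be sketched in plain JavaScript. This is an illustration only, assuming hypothetical ``events`` data and a hypothetical ``comparatorFor`` helper that turns an index-style specification into a sort comparator; it does not use or model any MongoDB internals.

```javascript
// Hypothetical event records; the field names follow the note above.
const events = [
  { username: "bob", timestamp: 100 },
  { username: "alice", timestamp: 300 },
  { username: "bob", timestamp: 250 },
  { username: "alice", timestamp: 120 },
];

// Build a comparator from an index-style specification, where 1 means
// ascending and -1 means descending for the corresponding field.
function comparatorFor(spec) {
  const fields = Object.entries(spec);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0;
  };
}

// Users alphabetically, and within each user the most recent event first.
const ordered = [...events].sort(comparatorFor({ username: 1, timestamp: -1 }));

// Walking the same order backwards yields { username: -1, timestamp: 1 }.
const backwards = [...ordered].reverse();

console.log(ordered.map((e) => `${e.username}:${e.timestamp}`).join(", "));
// alice:300, alice:120, bob:250, bob:100
```

Reading the order in reverse gives usernames descending with timestamps ascending, but no traversal of this ordering yields both fields ascending, which is why the relative direction of the fields matters.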

.. _index-types-multikey:

@@ -236,7 +248,7 @@ following form:
]
}

An index on the ``comments`` field would be a multikey index, and will
An index on the ``comments.text`` field would be a multikey index, and will
add items to the index for all of the sub-documents in the array. As a
result you will be able to run the following query, using only the
index to locate the document:
@@ -245,13 +257,6 @@ index to locate the document:

db.feedback.find( { "comments.text": "Please expand the olive selection." } )
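
As a rough mental model of a multikey index, the plain JavaScript sketch below adds one index entry per array element. It is a simplification under stated assumptions: the ``feedback`` documents shown here are hypothetical sample data, and ``buildMultikeyIndex`` is an illustrative helper, not part of MongoDB.

```javascript
// Hypothetical documents with an array of comment sub-documents.
const feedback = [
  {
    _id: 1,
    title: "Grocery Quality",
    comments: [
      { author_id: "a", text: "Please expand the cheddar selection." },
      { author_id: "b", text: "Please expand the olive selection." },
    ],
  },
  {
    _id: 2,
    title: "Bakery",
    comments: [{ author_id: "c", text: "Great bread." }],
  },
];

// Build a toy multikey index: every element of the array contributes its
// own entry, mapping the indexed value to the _id of the owning document.
function buildMultikeyIndex(docs, arrayField, subField) {
  const index = new Map();
  for (const doc of docs) {
    for (const element of doc[arrayField] || []) {
      const key = element[subField];
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const commentTextIndex = buildMultikeyIndex(feedback, "comments", "text");

// Look up a document by the text of any one of its comments.
const hits = commentTextIndex.get("Please expand the olive selection.") || [];

console.log(commentTextIndex.size); // 3
console.log(hits);                  // [ 1 ]
```

Two documents produce three index entries here, which also hints at why multikey indexes on large arrays can grow quickly.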

The following operators are useful for interacting with arrays, like
the ones that you would index using multikey indexes.

- :operator:`$addToSet`
- :operator:`$push`
- :operator:`$pull`
- :operator:`$all`

.. warning::

@@ -266,8 +271,8 @@ the ones that you would index using multikey indexes.
Unique Index
~~~~~~~~~~~~

The unique index will cause MongoDB to reject all documents that
contain a duplicate value for the index field. To create a unique index
A unique index will cause MongoDB to reject all documents that
contain a duplicate value for the indexed field. To create a unique index
on the ``user_id`` field of the ``members`` collection, use the
following operation in the :program:`mongo` shell:

@@ -312,8 +317,9 @@ the :program:`mongo` shell:

.. note::

Sparse indexes are not `block-level`_ indexes. Think of them as
dense indexes with a specific filter.
Sparse indexes in MongoDB are not to be confused with `block-level`_
indexes in other databases. Think of them as dense indexes with a
specific filter.

You can combine the sparse index option with the :ref:`unique
indexes <index-type-unique>` option so that :program:`mongod` will
@@ -351,7 +357,7 @@ By default, creating an index is a blocking operation. Building an
index on a large collection of data, the operation can take a long
time to complete. To resolve this issue, the background option can
allow you to continue to use your :program:`mongod` instance during
the index build. Create an index in the background of the ``zipcide``
the index build. Create an index in the background of the ``zipcode``
field of the ``people`` collection using a command that resembles the
following:

@@ -412,8 +418,7 @@ construction:
:dbcommand:`compact` will not run concurrently with a background
index build.

Queries will not use these indexes until the index build is complete
because the index builds in the ``system.indexes`` database.
Queries will not use these indexes until the index build is complete.

.. _index-creation-duplicate-dropping:

@@ -535,6 +540,7 @@ data.

.. TODO insert link to special /core/geospatial.txt documentation
on this topic. once that document exists.
.. TODO short mention of geoHaystack indexes here?

Index Limitations
-----------------
@@ -543,7 +549,7 @@ Be aware of the following current limitations of MongoDB's indexes:

- A collection may have no more than :ref:`64 indexes <limit-number-of-indexes-per-collection>`.

- Indexed items can have no more than :ref:`1024 bytes <limit-index-size>`.
- Index keys can be no larger than :ref:`1024 bytes <limit-index-size>`.

This includes the field value or values, the field name or names,
and the :term:`namespace`.