Commit bdc1baf

Index comments
1 parent 5ae94fe commit bdc1baf

2 files changed

+111
-8
lines changed

draft/administration/indexes.txt

Lines changed: 72 additions & 6 deletions
@@ -40,6 +40,8 @@ of the ``people`` collection:
4040

4141
db.people.ensureIndex( { phone-number: 1 } )
4242

43+
TODO: you need ""s around phone-number, otherwise it's invalid JS (phone minus number).
44+
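The quoting problem flagged in the TODO above can be checked in plain JavaScript. The sketch below is illustrative only: ``parsesAsExpression`` is a hypothetical helper, not part of the MongoDB shell, and it only tests whether a snippet parses as an expression.

```javascript
// Hypothetical helper (not part of MongoDB): returns true when `source`
// parses as a JavaScript expression and false when it is a syntax error.
function parsesAsExpression(source) {
  try {
    // The Function constructor only compiles the body; it is never called.
    new Function("return (" + source + ");");
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err;
  }
}

// An unquoted hyphenated key is not a valid property name.
const unquotedOk = parsesAsExpression("{ phone-number: 1 }");
// Quoting the key makes it an ordinary string property name.
const quotedOk = parsesAsExpression('{ "phone-number": 1 }');

console.log(unquotedOk, quotedOk); // false true
```

The same reasoning applies to the ``tax-id`` examples further down in this file.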
4345
To create a :ref:`compound index <index-type-compound>`, use an
4446
operation that resembles the following prototype:
4547

@@ -60,13 +62,17 @@ collection:
6062
To build indexes for a :term:`replica set`, before version 2.2,
6163
see :ref:`index-building-replica-sets`.
6264

65+
TODO: I don't think anything changed about replica set index builds for 2.2...
66+
6367
.. [#ensure] As the name suggests, :func:`ensureIndex() <db.collection.ensureIndex()>`
6468
only creates an index if an index of the same specification does
6569
not already exist.
6670

6771
Sparse
6872
``````
6973

74+
TODO: Sparse? Maybe "Types of Indexes->Sparse"?
75+
7076
To create a :ref:`sparse index <index-type-sparse>` on a field, use an
7177
operation that resembles the following prototype:
7278

@@ -87,6 +93,12 @@ without the ``twitter_name`` field.
8793

8894
MongoDB cannot create sparse compound indexes.
8995

96+
TODO: is this true? I thought that it could.
97+
98+
TODO: Is there more doc on sparse indexes somewhere? Seems like this is missing
99+
some info like getting different results back when the index is used, null
100+
counts as existing, etc.
101+
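As a rough illustration of the "null counts as existing" point in the TODO above, the following sketch models which documents would receive an entry in a sparse index. It is a simplified in-memory model in plain JavaScript, not MongoDB's implementation; ``sparseIndexEntries`` and the sample documents are made up for the example.

```javascript
// Simplified model, NOT MongoDB's implementation: a sparse index holds an
// entry for every document in which the field is present, including
// documents where the field is explicitly null.
function sparseIndexEntries(docs, field) {
  return docs.filter((doc) => Object.prototype.hasOwnProperty.call(doc, field));
}

const users = [
  { _id: 1, twitter_name: "alice" },
  { _id: 2, twitter_name: null }, // field present, value null: indexed
  { _id: 3 },                     // field missing: not indexed
];

const indexedIds = sparseIndexEntries(users, "twitter_name").map((doc) => doc._id);
console.log(indexedIds); // [ 1, 2 ]
```

Under this model, an operation that walks only the sparse index never sees document 3, which is why results can differ from a full collection scan.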
90102
Unique
91103
``````
92104

@@ -105,10 +117,14 @@ records for the same legal entity:
105117

106118
db.accounts.ensureIndex( { tax-id: 1 }, { unique: true } )
107119

120+
TODO: tax-id should be in ""s.
121+
108122
The :ref:`_id index <index-type-primary>` is a unique index. In some
109123
situations you may want to use the ``_id`` field for these primary
110124
data rather than using a unique index on another field.
111125

126+
TODO: "for these primary data"?
127+
112128
In many situations you will want to combine the ``unique`` constraint
113129
with the ``sparse`` option. When MongoDB indexes a field, if a
114130
document does not have a value for a field, the index entry for that
@@ -141,6 +157,8 @@ as in the following example:
141157

142158
db.accounts.dropIndex( { tax-id: 1 } )
143159

160+
TODO: ""s!
161+
144162
This will remove the index on the ``tax-id`` field in the ``accounts``
145163
collection. The shell provides the following document after completing
146164
the operation:
@@ -203,6 +221,12 @@ for this operation.
203221
To rebuild indexes for a :term:`replica set`, before version 2.2,
204222
see :ref:`index-rebuilding-replica-sets`.
205223

224+
TODO: again, this probably isn't different in 2.2
225+
226+
TODO: one thing that I would appreciate you mentioning is that some drivers may
227+
create indexes like {a : NumberLong(1)} _which is fine_ and doesn't break
228+
anything so stop complaining about it.
229+
206230
Special Creation Options
207231
~~~~~~~~~~~~~~~~~~~~~~~~
208232

@@ -211,6 +235,8 @@ Special Creation Options
211235
TTL collections use a special ``expire`` index option. See
212236
:doc:`/tutorial/expire-data` for more information.
213237

238+
TODO: Are 2d indexes getting a mention?
239+
214240
Background
215241
``````````
216242

@@ -222,11 +248,25 @@ prototype invocation of :func:`db.collection.ensureIndex()`:
222248

223249
db.collection.ensureIndex( { a: 1 }, { background: true } )
224250

251+
TODO: what does it mean to build an index in the background? You might want to
252+
mention:
253+
* performance implications
254+
* that this type of index build can be killed
255+
* that this blocks the connection you sent the ensureindex on, but ops from
256+
other connections can proceed in
257+
* that indexes are created in the foreground on secondaries in 2.0,
258+
which blocks replication & slave reads. In 2.2, it does not block reads (but
259+
still blocks repl).
260+
225261
Drop Duplicates
226262
```````````````
227263

228264
To force the creation of a :ref:`unique index <index-type-unique>`
229-
index, you can use the ``dropDups`` option. This will force MongoDB to
265+
index
266+
267+
TODO: " on a collection with duplicate values in the field to be indexed "
268+
269+
you can use the ``dropDups`` option. This will force MongoDB to
230270
create a *unique* index by deleting documents with duplicate values
231271
when building the index. Consider the following prototype invocation
232272
of :func:`db.collection.ensureIndex()`:
@@ -243,12 +283,15 @@ See the full documentation of :ref:`duplicate dropping
243283
Specifying ``{ dropDups: true }`` will delete data from your
244284
database. Use with extreme caution.
245285

286+
TODO: I'd say it "may" delete data from your DB, not like it's going to go all
287+
Shermanesque on your data.
288+
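To make the "may delete" wording in the TODO above concrete, here is a minimal sketch of the effect of ``dropDups``, assuming the first document seen for each key is kept and later duplicates are removed. It is a plain JavaScript model, not MongoDB's implementation; ``simulateDropDups`` and the sample data are hypothetical.

```javascript
// Simplified model, NOT MongoDB's implementation: keep the first document
// seen for each key value and treat later duplicates as deleted.
function simulateDropDups(docs, field) {
  const seen = new Set();
  const kept = [];
  const dropped = [];
  for (const doc of docs) {
    const key = doc[field];
    if (seen.has(key)) {
      dropped.push(doc);
    } else {
      seen.add(key);
      kept.push(doc);
    }
  }
  return { kept, dropped };
}

// With duplicate values, documents are lost.
const withDups = simulateDropDups(
  [{ _id: 1, "tax-id": "A" }, { _id: 2, "tax-id": "A" }, { _id: 3, "tax-id": "B" }],
  "tax-id"
);
console.log(withDups.dropped.length); // 1

// With no duplicate values, nothing is deleted.
const noDups = simulateDropDups(
  [{ _id: 1, "tax-id": "A" }, { _id: 2, "tax-id": "B" }],
  "tax-id"
);
console.log(noDups.dropped.length); // 0
```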
246289
.. _index-building-replica-sets:
247290

248291
Building Indexes on Replica Sets
249292
--------------------------------
250293

251-
.. versionchanged:: 2.2
294+
.. versionchanged:: 2.2
252295
Index rebuilding operations on :term:`secondary` members of
253296
:term:`replica sets <replica set>` now run as normal background
254297
index operations. Run :func:`ensureIndex()
@@ -257,20 +300,30 @@ Building Indexes on Replica Sets
257300
the following operation to isolate and control the impact of
258301
indexing building operations on a set as a whole.
259302

303+
TODO: I think there needs to be a huge mention that this still blocks
304+
replication, so the procedure below is recommended.
305+
260306
.. admonition:: For Version 1.8 and 2.0
261307

262308
:ref:`Background index creation operations
263309
<index-creation-background>` became *foreground* indexing
264310
operations on :term:`secondary` members of replica sets. These
265311
foreground operations will block all replication on the
266-
secondaries, and can impact performance of the entire set. To build
312+
secondaries,
313+
314+
TODO: and don't allow any reads to go through.
315+
316+
and can impact performance of the entire set. To build
267317
indexes with minimal impact on a replica set, use the following
268318
procedure for all non-trivial index builds:
269319

270320
#. Stop the :program:`mongod` process on one secondary. Restart the
271-
:program:`mongod` process *without* the :option:`--replSet <mongod --replSet>`
321+
:program:`mongod` process *without* the :option:`--replSet <mongod --replSet>`
272322
option. This instance is now in "standalone" mode.
273323

324+
TODO: generally we recommend running it on a different port, too, so that apps
325+
& other servers in the set don't try to contact it.
326+
274327
#. Create the new index or rebuild the index on this :program:`mongod`
275328
instance.
276329

@@ -287,7 +340,7 @@ Building Indexes on Replica Sets
287340

288341
Ensure that your :ref:`oplog` is large enough to permit the
289342
indexing or re-indexing operation to complete without falling
290-
too far behind to catch up. See the ":ref:`replica-set-oplog-sizing`"
343+
too far behind to catch up. See the ":ref:`replica-set-oplog-sizing`"
291344
documentation for additional information.
292345

293346
.. note::
@@ -301,6 +354,9 @@ Building Indexes on Replica Sets
301354
For the best results, always create indexes *before* you begin
302355
inserting data into a collection.
303356

357+
TODO: well, sort of. That'll build the indexes fast, but make the inserts
358+
slower. Overall, it's faster to insert data, then build indexes.
359+
304360
Measuring Index Use
305361
-------------------
306362

@@ -318,7 +374,12 @@ following tools:
318374
- :func:`cursor.hint()`
319375

320376
Append the :func:`hint() <cursor.hint()>` to any cursor (e.g.
321-
query) with the name of an index as the argument to *force* MongoDB
377+
query) with the name
378+
379+
TODO: this isn't "the name of an index." I'd say just "with the index." The
380+
name of an index is a string like "zipcode_1".
381+
382+
of an index as the argument to *force* MongoDB
322383
to use a specific index to fulfill the query. Consider the following
323384
example:
324385

@@ -331,8 +392,13 @@ following tools:
331392
<cursor.explain()>` in conjunction with each other to compare the
332393
effectiveness of a specific index.
333394

395+
TODO: mention $natural to force no index usage?
396+
334397
- :status:`indexCounters`
335398

336399
Use the :status:`indexCounters` data in the output of
337400
:dbcommand:`serverStatus` for insight into database-wise index
338401
utilization.
402+
403+
TODO: I'd like to see this also cover how to track how far an index build has
404+
gotten and how to kill an index build.
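
Returning to the earlier comment on ``hint()``: the difference between an index specification such as ``{ zipcode: 1 }`` and an index name such as ``"zipcode_1"`` can be sketched as follows. The helper ``defaultIndexName`` is hypothetical and assumes the default naming scheme, in which each field and its direction are joined with underscores; indexes created with an explicit ``name`` option would not follow it.

```javascript
// Hypothetical helper: derives the default-style name for an index
// specification by joining each field and its direction with underscores.
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => field + "_" + direction)
    .join("_");
}

console.log(defaultIndexName({ zipcode: 1 }));               // zipcode_1
console.log(defaultIndexName({ username: 1, status: -1 }));  // username_1_status_-1
```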

draft/applications/indexes.txt

Lines changed: 39 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,8 @@ database. To use a covered index you must:
3838
- in the :term:`projection`, explicitly exclude the ``_id`` field from
3939
the result set, unless the index includes ``_id``.
4040

41+
TODO: the third point seems like part of the first point.
42+
4143
Use the :func:`explain() <cursor.explain()>` to test the query. If
4244
MongoDB was able to use a covered index, then the value of the
4345
``indexOnly`` field will be ``true``.
@@ -49,7 +51,12 @@ disk, and indexes are smaller than the documents they catalog.
4951
Sort Using Indexes
5052
~~~~~~~~~~~~~~~~~~
5153

52-
While the :dbcommand:`sort` database command and the :func:`sort()
54+
While the :dbcommand:`sort` database command
55+
56+
TODO: sort database command? Is "database command" being used in a different
57+
sense here?
58+
59+
and the :func:`sort()
5360
<cursor.sort()>` helper support in-memory sort operations without the
5461
use of an index, these operations are:
5562

@@ -77,6 +84,9 @@ results. For example:
7784
When using compound indexes to support sort operations, the sorted
7885
field must be the *last* field in the index.
7986

87+
TODO: not true! In 2.2, you can use, say, the index above for a query on
88+
username, sort by status, too.
89+
8090
Store Indexes in Memory
8191
~~~~~~~~~~~~~~~~~~~~~~~
8292

@@ -124,6 +134,8 @@ deep understanding of:
124134

125135
MongoDB can only use *one* index to support any given operation.
126136

137+
TODO: trickily put. I hope you mention $or elsewhere?
138+
127139
Selectivity
128140
~~~~~~~~~~~
129141

@@ -145,9 +157,22 @@ with fulfilling the query.
145157
these values using the index, MongoDB will only need to scan a very
146158
small number of documents to fulfill the rest of the query.
147159

160+
TODO: It'd be clearer to use "real" numbers in the second example, too, but I
161+
think you'd have to re-jigger the example to do so.
162+
148163
To ensure optimal performance, use indexes that are maximally
149164
selective relative to your queries.
150165

166+
TODO: the example makes selectivity sound like the uniqueness of the index,
167+
which isn't the whole story. Having something like {x:{$gt:3}} that matches 60%
168+
of the collection isn't very selective, even if x has a unique index on it.
169+
170+
I think it's important to emphasize that selectivity is whittling down possible
171+
results to as small a % as possible.
172+
173+
TODO: Also, might be worth mentioning that, if you cannot get selectivity low
174+
enough, indexes will actually be slower than table scans.
175+
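The distinction drawn in the TODOs above, that selectivity depends on the fraction of documents a query matches rather than on whether the indexed values are unique, can be sketched in plain JavaScript. ``matchedFraction`` and the ten sample documents are made up for illustration, and the predicates stand in for the queries ``{ x: { $gt: 3 } }`` and ``{ x: 3 }``.

```javascript
// Fraction of documents that a predicate matches; a lower value means the
// query is more selective.
function matchedFraction(docs, predicate) {
  if (docs.length === 0) {
    return 0;
  }
  return docs.filter(predicate).length / docs.length;
}

// Ten documents with unique x values 0..9.
const docs = Array.from({ length: 10 }, (_, i) => ({ x: i }));

// Stand-in for { x: { $gt: 3 } }: matches x = 4..9.
const rangeFraction = matchedFraction(docs, (doc) => doc.x > 3);
// Stand-in for { x: 3 }: matches a single document.
const pointFraction = matchedFraction(docs, (doc) => doc.x === 3);

console.log(rangeFraction, pointFraction); // 0.6 0.1
```

Even though every ``x`` value is unique, the range query still touches 60% of the collection, so an index on ``x`` narrows it far less than it narrows the equality query.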
151176
Insert Throughput
152177
~~~~~~~~~~~~~~~~~
153178

@@ -156,20 +181,28 @@ Insert Throughput
156181
.. TODO fact check
157182

158183
MongoDB must update all indexes associated with a collection following
159-
every insert or update operation. Every index on a collection adds
184+
every insert or update operation.
185+
186+
TODO: or delete, too
187+
188+
Every index on a collection adds
160189
some amount of overhead to these operations. In almost every case, the
161190
performance gains that indexes realize for read operations are worth
162191
the insertion penalty; however:
163192

164193
- in some cases, an index to support an infrequent query may incur
165194
more insert-related costs, than saved read-time.
166195

196+
TODO: rm comma: "insert-related costs than saved read-time"
197+
167198
- in some situations, if you have many indexes on a collection with a
168199
high insert throughput and a number of very similar indexes, you may
169200
find better overall results by using a slightly less effective index
170201
on some queries if it means consolidating the total number of
171202
indexes.
172203

204+
TODO: do you cover what indexes overlap?
205+
173206
Index Size
174207
~~~~~~~~~~
175208

@@ -182,9 +215,13 @@ index to locate those documents, MongoDB can maintain a much smaller
182215
- all of your indexes use less space than the documents in the
183216
collection.
184217

218+
TODO: individually or all together?
219+
185220
- the indexes and a reasonable working set can fit RAM at the same
186221
time.
187222

223+
TODO: a reasonable working set?
224+
188225
.. _indexing-right-handed:
189226

190227
Indexes do not have to fit *entirely* into RAM in all cases. If the
