@@ -50,42 +50,49 @@ Text Indexes
50
50
51
51
.. note::
52
52
53
- The ``text`` index type is currently an experimental feature and
54
- you must enable it at run time. Interfaces and on-disk format may
55
- change in future releases.
53
+ The ``text`` index type is currently an experimental feature.
54
+ Interfaces and on-disk format may change in future releases. To use a
55
+ ``text`` index, you need to enable it at run time. Do **not** enable
56
+ or use ``text`` indexes on production systems.
56
57
57
58
Background
58
59
``````````
59
60
60
- MongoDB2.3.2 includes a new ``text`` index type. ``text`` indexes
61
- support boolean text search queries. Any set of fields containing
62
- string data may be text indexed. You may only maintain a single
63
- ``text`` index per collection. ``text`` indexes are fully consistent
64
- and updated in real-time as applications insert, update, or delete
65
- documents from the database. The ``text`` index and query system
66
- supports language specific stemming and stop-words. Additionally:
61
+ MongoDB 2.3.2 includes a new ``text`` index type. ``text`` indexes
62
+ support boolean text search queries:
67
63
68
- - indexes and queries drop stop words (i.e. "the," "an," "a," "and,"
69
- etc.)
64
+ - Any set of fields containing string data may be text indexed.
70
65
71
- - MongoDB stores words stemmed during insertion in the index, using
72
- simple suffix stemming, including support for a number of
73
- languages. MongoDB automatically stems :dbcommand:`text` queries at
74
- before beginning the query.
66
+ - You may only maintain a **single** ``text`` index per collection.
67
+
68
+ - ``text`` indexes are fully consistent and updated in real-time as
69
+ applications insert, update, or delete documents from the database.
70
+
71
+ - The ``text`` index and query system supports language specific
72
+ stemming and stop words. Additionally:
73
+
74
+ - Indexes and queries drop stop words (e.g. "the," "an," "a," "and,"
75
+ etc.)
76
+
77
+ - MongoDB stores words stemmed during insertion in the index, using
78
+ simple suffix stemming, including support for a number of
79
+ languages. MongoDB automatically stems :dbcommand:`text` queries
80
+ before beginning the query.
75
81
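As a rough illustration of the stop-word and stemming behavior described
above, the following sketch derives the set of index terms for one
string. The ``STOP_WORDS`` list and the ``naiveStem`` suffix rules are
hypothetical stand-ins, not MongoDB's actual stemmer; the sketch only
shows why the index holds one entry per unique post-stemmed word.

```javascript
// Hypothetical stop-word list; the real list is language specific.
const STOP_WORDS = new Set(["the", "an", "a", "and", "of", "to"]);

// Toy suffix stemmer (an assumption for illustration only):
// strips a few common English suffixes from longer words.
function naiveStem(word) {
  for (const suffix of ["ing", "es", "s"]) {
    if (word.length > suffix.length + 2 && word.endsWith(suffix)) {
      return word.slice(0, -suffix.length);
    }
  }
  return word;
}

// Return the unique post-stemmed terms that would each need an index entry.
function indexTerms(text) {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const terms = new Set();
  for (const w of words) {
    if (!STOP_WORDS.has(w)) {
      terms.add(naiveStem(w));
    }
  }
  return [...terms];
}
```

In this sketch, ``indexTerms("The index and the indexes")`` yields a
single term, because both remaining words reduce to the same stem.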
76
82
However, ``text`` indexes have large storage requirements and incur
77
83
**significant** performance costs:
78
84
79
- - Text indexes can be large. They contain one index entry for each
80
- unique word indexed for each document inserted.
85
+ - Text indexes can be large. They contain one index entry for each
86
+ unique post-stemmed word in each indexed field for each document
87
+ inserted.
81
88
82
89
- Building a ``text`` index is very similar to building a large
83
90
multi-key index, and therefore may take longer than building a
84
- simple ordered (scalar)index.
91
+ simple ordered (scalar) index.
85
92
86
93
- ``text`` indexes will impede insertion throughput, because MongoDB
87
- must add an index entry for each unique word in each indexed field
88
- of each new source document.
94
+ must add an index entry for each unique post-stemmed word in each
95
+ indexed field of each new source document.
89
96
90
97
- some :dbcommand:`text` searches may affect performance on your
91
98
:program:`mongod`, particularly for negation queries and phrase
@@ -103,11 +110,11 @@ indexes have the following limitations and behaviors:
103
110
- MongoDB does not stem phrases or negations in :dbcommand:`text`
104
111
queries.
105
112
106
- - the index is case insensitive.
113
+ - the index is case-insensitive.
107
114
108
115
- a collection may only have a single ``text`` index at a time.
109
116
110
- .. important:: Do not enable or use ``text`` indexes on production
117
+ .. important:: Do **not** enable or use ``text`` indexes on production
111
118
systems.
112
119
113
120
.. May be worth including this:
@@ -120,21 +127,25 @@ indexes have the following limitations and behaviors:
120
127
Test ``text`` Indexes
121
128
`````````````````````
122
129
123
- .. important:: The ``text`` index type is an experimental feature and
124
- you must enable the feature before creating or accessing a text
125
- index. To enable text indexes, issue the following command at the
126
- :program:`mongo` shell:
130
+ The ``text`` index type is an experimental feature and you need to
131
+ enable the feature before creating or accessing a text index.
132
+
133
+ To enable text indexes, issue the following command in the
134
+ :program:`mongo` shell:
127
135
128
- .. code-block:: javascript
136
+ .. important:: Do **not** enable or use ``text`` indexes on production
137
+ systems.
129
138
130
- db.adminCommand( { setParameter: 1, textSearchEnabled: true } )
139
+ .. code-block:: javascript
131
140
132
- You can also start the :program:`mongod` with the following
133
- invocation:
141
+ db.adminCommand( { setParameter: 1, textSearchEnabled: true } )
134
142
135
- .. code-block:: sh
143
+ You can also start the :program:`mongod` with the following
144
+ invocation:
136
145
137
- mongod --setParameter textSearchEnabled=true
146
+ .. code-block:: sh
147
+
148
+ mongod --setParameter textSearchEnabled=true
138
149
139
150
Create Text Indexes
140
151
^^^^^^^^^^^^^^^^^^^
@@ -146,9 +157,12 @@ To create a ``text`` index, use the following invocation of
146
157
147
158
db.collection.ensureIndex( { content: "text" } )
148
159
149
- ``text`` indexes catalog all string data in the ``content`` field. Your
150
- ``text`` index can include content from multiple fields, or arrays,
151
- and from fields in sub-documents, as in the following:
160
+ This ``text`` index catalogs all string data in the ``content`` field
161
+ where the ``content`` field contains a string or an array of string
162
+ elements. To index fields in sub-documents, you need to specify the
163
+ individual fields from the sub-documents using the :term:`dot
164
+ notation`. A ``text`` index can include multiple fields, as in the
165
+ following:
152
166
153
167
.. code-block:: javascript
154
168
@@ -157,7 +171,7 @@ and from fields in sub-documents, as in the following:
157
171
"users.profiles": "text" } )
158
172
159
173
The default name for the index consists of the ``<field name>``
160
- concatenated with ``_text``, as in the following:
174
+ concatenated with ``_text`` for the indexed fields, as in the following:
161
175
162
176
.. code-block:: javascript
163
177
@@ -193,19 +207,22 @@ sub-documents. Furthermore, the ``content`` field has a weight of 1 and
193
207
the ``users.profiles`` field has a weight of 2.
194
208
195
209
You can add conventional ascending or descending index field(s) as a
196
- prefix or suffix of the index so that queries can limit the number of
197
- index entries the query must review to perform the query. You cannot
198
- include :ref:`multi-key <index-type-multi-key>` index field nor
199
- :ref:`geospatial <index-feature-geospatial>` index field.
210
+ prefix or suffix of the index. You cannot include a :ref:`multi-key
211
+ <index-type-multi-key>` index field or a :ref:`geospatial
212
+ <index-feature-geospatial>` index field.
200
213
201
214
If you create an ascending or descending index as a prefix of a
202
215
``text`` index:
203
216
204
217
- MongoDB will only index documents that have the prefix field
205
218
(i.e. ``username``) and
206
219
207
- - All :dbcommand:`text` queries using this index must specify the
208
- prefix field in the ``filter`` query.
220
+ - The :dbcommand:`text` query can limit the number of index entries to
221
+ review in order to perform the query.
222
+
223
+ - All :dbcommand:`text` queries using this index must include the
224
+ ``filter`` option that specifies an equality condition for the prefix
225
+ field or fields.
209
226
210
227
Create this index with the following operation:
211
228
@@ -295,8 +312,15 @@ cursor.
295
312
:param string search:
296
313
297
314
A text string that MongoDB stems and uses to query the ``text``
298
- index. When specifying phrase matches, you must escape quote
299
- characters as ``\"``.
315
+ index. In the :program:`mongo` shell, to specify a phrase to
316
+ match, you can either:
317
+
318
+ - enclose the phrase in escaped double quote characters
319
+ (``\"<phrase>\"``) within the ``search`` string, as in
320
+ ``"\"coffee table\""``, or
321
+
322
+ - enclose the phrase in single quote characters, as in ``"'coffee
323
+ table'"``
300
324
301
325
:param document filter:
302
326
@@ -318,19 +342,20 @@ cursor.
318
342
:param number limit:
319
343
320
344
Optional. Specify the maximum number of documents to include in
321
- the response.
345
+ the response. The default limit is 100.
322
346
323
347
:param string language:
324
348
325
349
Optional. Specify the language that determines the tokenization,
326
- stemming, and the stop words for the search.
350
+ stemming, and the stop words for the search. The default language
351
+ is ``english``.
327
352
328
353
:return:
329
354
330
- :dbcommand:`text` returns results in the form of a
331
- document. Results must fit within the :limit:`BSON Document
332
- Size`. Use a projection setting to limit the size of the result
333
- set.
355
+ :dbcommand:`text` returns results in the form of a document.
356
+ Results must fit within the :limit:`BSON Document Size`. Use the
357
+ ``limit`` and the ``projection`` parameters to limit the size of
358
+ the result set.
334
359
335
360
The implicit connector between the terms of a multi-term search is a
336
361
disjunction (``OR``). Search for ``"first second"`` searches
@@ -367,20 +392,20 @@ cursor.
367
392
368
393
db.collection.runCommand( "text", { search: "search" } )
369
394
370
- This query returns documents that contain the word
371
- ``search``, case-insensitive, in the ``content`` field.
372
-
395
+ This query returns documents that contain the word ``search``,
396
+ case-insensitive, in the ``content`` field.
397
+
373
398
#. Search for multiple words, ``create`` or ``search`` or ``fields``:
374
-
399
+
375
400
.. code-block:: javascript
376
-
401
+
377
402
db.collection.runCommand( "text", { search: "create search fields" } )
378
-
403
+
379
404
This query returns documents that contain either ``create``
380
405
**or** ``search`` **or** ``field`` in the ``content`` field.
381
-
406
+
382
407
#. Search for the exact phrase ``create search fields``:
383
-
408
+
384
409
.. code-block:: javascript
385
410
386
411
db.collection.runCommand( "text", { search: "\"create search fields\"" } )
@@ -397,7 +422,7 @@ cursor.
397
422
398
423
Use the ``-`` as a prefix to terms to specify negation in the
399
424
search string. The query returns documents that contain
400
- either ``creat `` **or** ``search``, but **not** ``field``, all
425
+ either ``create`` **or** ``search``, but **not** ``field``, all
401
426
case-insensitive, in the ``content`` field. Prefixing a word
402
427
with a hyphen (``-``) negates a word:
403
428
@@ -407,8 +432,8 @@ cursor.
407
432
- A ``<search string>`` that only contains negative words returns no match.
408
433
409
434
- A hyphenated word, such as ``case-insensitive``, is not a
410
- negation. The :dbcommand:`text` command treats the hyphen and
411
- as a delimiter.
435
+ negation. The :dbcommand:`text` command treats the hyphen as a
436
+ delimiter.
412
437
413
438
#. Search for a single word ``search`` with an additional ``filter`` on
414
439
the ``about`` field, but **limit** the results to 2 documents with the
@@ -424,16 +449,17 @@ cursor.
424
449
projection: { comments: 1, _id: 0 }
425
450
}
426
451
)
427
-
452
+
428
453
- The ``filter`` :ref:`query document <mongodb-query-document>`
429
- is uses a :operator:`regular expression <$regex>`. See the
430
- :ref :`query operators <operator>` page for available query
454
+ uses a :operator:`regular expression <$regex>`. See the
455
+ :doc:`query operators </reference/operator>` page for available query
431
456
operators.
432
-
433
- - The ``projection`` must explicitly exclude (``0``) the ``_id``
434
- field. Within the ``projection`` document, you cannot mix
435
- inclusions (i.e. ``<fieldA>: 1``) and exclusions (i.e. ``<fieldB>:
436
- 0``), except for the ``_id`` field.
457
+
458
+ - Because the ``_id`` field is implicitly included, in order to
459
+ return **only** the ``comments`` field, you must explicitly
460
+ exclude (``0``) the ``_id`` field. Within the ``projection``
461
+ document, you cannot mix inclusions (i.e. ``<fieldA>: 1``) and
462
+ exclusions (i.e. ``<fieldB>: 0``), except for the ``_id`` field.
437
463
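The ``search``, ``filter``, ``limit``, ``language``, and ``projection``
parameters described above combine into a single command document. The
sketch below assembles one in plain JavaScript. ``phrase`` and
``buildTextCommand`` are hypothetical helpers introduced only for
illustration, and the sketch assumes every parameter other than
``search`` is optional.

```javascript
// Hypothetical helper: wrap a phrase in double quotes so that the
// search string contains a phrase match, e.g. "\"coffee table\"".
function phrase(words) {
  return '"' + words + '"';
}

// Hypothetical helper: assemble the options document for the text command.
// Assumption: only `search` is required; other keys are copied if given.
function buildTextCommand(search, options = {}) {
  const cmd = { search: search };
  for (const key of ["filter", "projection", "limit", "language"]) {
    if (options[key] !== undefined) {
      cmd[key] = options[key];
    }
  }
  return cmd;
}

// A phrase match plus a negated term, limited to 2 documents and
// returning only the `comments` field.
const cmd = buildTextCommand(phrase("coffee table") + " -plastic", {
  limit: 2,
  projection: { comments: 1, _id: 0 }
});

// In the mongo shell, pass the document to the text command:
//   db.collection.runCommand( "text", cmd )
```

Building the document separately keeps the quoting of phrases in one
place and makes it easy to inspect the command before running it.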
438
464
Additional Authentication Features
439
465
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~