Commit bd6e654

jason-price-mongodb authored
DOCSP-17644 clustered indexes (#802)
* DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * Implemented review comment changes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes * DOCSP-17644-clustered-indexes Co-authored-by: jason-price-mongodb <[email protected]>
1 parent d720615 commit bd6e654

14 files changed

+421
-24
lines changed

config/redirects

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1803,6 +1803,7 @@ raw: ${prefix}/manual/core/wildcard -> ${base}/manual/core/index-wildcard/
18031803
[v3.6-v4.4]: ${prefix}/${version}/reference/operator/aggregation/setWindowFields/ -> ${base}/${version}/reference/operator/aggregation/
18041804
[v3.6-v4.4]: ${prefix}/${version}/reference/operator/aggregation/shift/ -> ${base}/${version}/reference/operator/aggregation/
18051805
[v3.6-v4.4]: ${prefix}/${version}/reference/versioned-api/ -> ${base}/${version}/reference/
1806+
[v3.6-v4.4]: ${prefix}/${version}/core/clustered-collections/ -> ${base}/${version}/core/databases-and-collections/
18061807

18071808
#
18081809
# Redirects for 5.0 and greater (if pages are removed in 4.4 that used to exist in earlier versions)

source/core/clustered-collections.txt

Lines changed: 259 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,259 @@
1+
.. _clustered-collections:
2+
3+
=====================
4+
Clustered Collections
5+
=====================
6+
7+
.. default-domain:: mongodb
8+
9+
.. contents:: On this page
10+
:local:
11+
:backlinks: none
12+
:depth: 1
13+
:class: singlecol
14+
15+
.. versionadded:: 5.3
16+
17+
Overview
18+
--------
19+
20+
.. include:: /includes/clustered-collections-introduction.rst
21+
22+
Benefits
23+
--------
24+
25+
Because clustered collections store documents ordered by the
26+
:ref:`clustered index <db.createCollection.clusteredIndex>` key value,
27+
clustered collections have the following benefits compared to
28+
non-clustered collections:
29+
30+
- Faster queries on clustered collections without needing a secondary
31+
index, such as queries with range scans and equality comparisons on
32+
the clustered index key.
33+
34+
- Clustered collections have a smaller storage size, which improves
35+
performance for queries and bulk inserts.
36+
37+
- Clustered collections can eliminate the need for a secondary :ref:`TTL
38+
(Time To Live) index <ttl-index>`.
39+
40+
- A clustered index is also a TTL index if you specify the
41+
:ref:`expireAfterSeconds <db.createCollection.expireAfterSeconds>`
42+
field.
43+
44+
- To be used as a TTL index, the ``_id`` field must be a supported
45+
date type. See :ref:`index-feature-ttl`.
46+
47+
- If you use a clustered index as a TTL index, it improves document
48+
delete performance and reduces the clustered collection storage
49+
size.
50+
51+
- Clustered collections have additional performance improvements for
52+
inserts, updates, deletes, and queries.
53+
54+
- All collections have an :ref:`_id index <index-type-id>`.
55+
56+
- A non-clustered collection stores the ``_id`` index separately from
57+
the documents. This requires two writes for inserts, updates, and
58+
deletes, and two reads for queries.
59+
60+
- A clustered collection stores the index and the documents together
61+
in ``_id`` value order. This requires one write for inserts,
62+
updates, and deletes, and one read for queries.
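
As a minimal sketch of the TTL benefit described above, the following
``mongosh`` snippet creates a clustered collection that also expires
documents. The collection name ``sessions``, the index name, the sample
document, and the one-hour ``expireAfterSeconds`` value are assumptions
chosen for illustration. The ``typeof db`` guard only skips the server
calls when the snippet is loaded outside ``mongosh``.

.. code-block:: javascript

   // Hypothetical options: a clustered index on _id plus a TTL of one hour.
   const sessionOptions = {
      clusteredIndex: { key: { _id: 1 }, unique: true, name: "sessions clustered key" },
      expireAfterSeconds: 3600
   };

   // The _id value is a date so that the clustered index can act as a TTL index.
   const sampleSession = { _id: new Date(), user: "example-user" };

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      db.createCollection( "sessions", sessionOptions );
      db.sessions.insertOne( sampleSession );
   }
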
63+
64+
Behavior
65+
--------
66+
67+
Clustered collections store documents ordered by the :ref:`clustered
68+
index <db.createCollection.clusteredIndex>` key value.
69+
70+
You can only have one clustered index in a collection because the
71+
documents can be stored in only one order. Only collections with a
72+
clustered index store the data in sorted order.
73+
74+
You can have a clustered index and add :term:`secondary indexes
75+
<secondary index>` to a clustered collection. Clustered indexes differ
76+
from secondary indexes:
77+
78+
- A clustered index can only be created when you create the collection.
79+
80+
- The clustered index keys are stored with the collection. The
81+
collection size returned by the :dbcommand:`collStats` command
82+
includes the clustered index size.
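
The following sketch adds a secondary index to a clustered collection
and then reads the sizes reported by :dbcommand:`collStats`. It assumes
that the ``orders`` clustered collection from the Examples section on
this page already exists. The ``quantity`` index key is an illustrative
choice, not a recommendation.

.. code-block:: javascript

   // Hypothetical secondary index key on the quantity field.
   const secondaryIndexKey = { quantity: 1 };
   const collStatsCommand = { collStats: "orders" };

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      db.orders.createIndex( secondaryIndexKey );

      const stats = db.runCommand( collStatsCommand );
      // For a clustered collection, size includes the clustered index.
      printjson( { size: stats.size, totalIndexSize: stats.totalIndexSize } );
   }
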
83+
84+
Limitations
85+
-----------
86+
87+
Clustered collection limitations:
88+
89+
- You cannot transform a non-clustered collection to a clustered
90+
collection, or the reverse. Instead, you can:
91+
92+
- Read documents from one collection and write them to another
93+
collection using an :ref:`aggregation pipeline
94+
<aggregation-pipeline-intro>` with an :pipeline:`$out` stage or
95+
a :pipeline:`$merge` stage.
96+
97+
- Export collection data with :binary:`~bin.mongodump` and import the
98+
data into another collection with :binary:`~bin.mongorestore`.
99+
100+
- By default, if a :term:`secondary index <secondary index>` exists on
101+
a clustered collection and the secondary index is usable by your
102+
query, the secondary index is selected instead of the clustered
103+
index.
104+
105+
- You must provide a hint to use the clustered index because it
106+
is not automatically selected by the :doc:`query optimizer
107+
</core/query-plans>`.
108+
109+
- The :ref:`clustered index <db.createCollection.clusteredIndex>` is
110+
not automatically used by the query optimizer if a usable secondary
111+
index exists.
112+
113+
- When a query uses a clustered index, it will perform a
114+
:term:`bounded collection scan`.
115+
116+
- The clustered index key must be on the ``_id`` field.
117+
118+
- You cannot hide a clustered index. See :doc:`Hidden indexes
119+
</core/index-hidden>`.
120+
121+
- If a clustered collection has secondary indexes, its storage size
122+
can be larger. This is because secondary
123+
indexes on a clustered collection with large clustered index keys may
124+
have a larger storage size than secondary indexes on a non-clustered
125+
collection.
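
The copy workaround and the hint requirement listed above might look
like the following sketch. The collection names ``orders_plain`` and
``orders_clustered`` are hypothetical. Hinting with the ``{ _id: 1 }``
key pattern assumes that the clustered index accepts the same hint forms
as other indexes, so confirm the chosen plan with ``explain()`` on your
deployment.

.. code-block:: javascript

   // Hypothetical target: a clustered collection created before the copy.
   const clusteredOptions = { clusteredIndex: { key: { _id: 1 }, unique: true } };

   // $merge writes every document from the source into the target.
   const copyPipeline = [ { $merge: { into: "orders_clustered" } } ];

   // Assumption: the clustered index can be hinted by its key pattern.
   const clusteredHint = { _id: 1 };
   const recentFilter = { _id: { $gt: new Date( "2022-03-18T12:47:00Z" ) } };

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      db.createCollection( "orders_clustered", clusteredOptions );
      db.orders_plain.aggregate( copyPipeline ).toArray();

      printjson(
         db.orders_clustered.find( recentFilter ).hint( clusteredHint ).toArray()
      );
   }
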
126+
127+
.. _clustered-collections-clustered-index-key-values:
128+
129+
Set Your Own Clustered Index Key Values
130+
---------------------------------------
131+
132+
By default, the :ref:`clustered index
133+
<db.createCollection.clusteredIndex>` key values are the unique document
134+
:ref:`object identifiers <objectid>`.
135+
136+
You can set your own clustered index key values. Your key:
137+
138+
- Must contain unique values.
139+
140+
- Must be immutable.
141+
142+
- Should contain sequentially increasing values. This is not a
143+
requirement but improves insert performance.
144+
145+
- Should be as small in size as possible.
146+
147+
- A clustered index supports keys up to 8 MB in size, but a much
148+
smaller clustered index key is best.
149+
150+
- A large clustered index key increases the size of the clustered
151+
collection and of its secondary indexes. This reduces
152+
the performance and storage benefits of the clustered collection.
153+
154+
- Secondary indexes on clustered collections with large clustered
155+
index keys may use more space compared to secondary indexes on
156+
non-clustered collections.
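
As an illustration of the guidelines above, the following sketch
generates small, unique, sequentially increasing ``_id`` values before
inserting documents. The ``buildInvoices`` helper, the ``invoices``
collection, and the starting value ``1000`` are hypothetical and are not
part of MongoDB.

.. code-block:: javascript

   // Hypothetical helper: assigns consecutive integer _id values.
   function buildInvoices( startId, amounts ) {
      return amounts.map( ( amount, i ) => ( { _id: startId + i, amount: amount } ) );
   }

   const invoices = buildInvoices( 1000, [ 25, 40, 15 ] );

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      db.createCollection(
         "invoices",
         { clusteredIndex: { key: { _id: 1 }, unique: true } }
      );
      db.invoices.insertMany( invoices );
   }

Because the keys increase with each document, new documents are added in
clustered index key order, which is the insert pattern the guidelines
recommend.
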
157+
158+
Examples
159+
--------
160+
161+
This section shows clustered collection examples.
162+
163+
``Create`` Example
164+
~~~~~~~~~~~~~~~~~~
165+
166+
.. include:: /includes/create-clustered-collection-example.rst
167+
168+
``db.createCollection`` Example
169+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
170+
171+
.. include:: /includes/db-create-clustered-collection-example.rst
172+
173+
Date Clustered Index Key Example
174+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
175+
176+
The following :method:`db.createCollection()` example adds a clustered
177+
collection named ``orders``:
178+
179+
.. code-block:: javascript
180+
181+
db.createCollection(
182+
"orders",
183+
{ clusteredIndex: { "key": { _id: 1 }, "unique": true, "name": "orders clustered key" } }
184+
)
185+
186+
In the example, :ref:`clusteredIndex
187+
<db.createCollection.clusteredIndex>` specifies:
188+
189+
.. |clustered-index-name| replace:: ``"name": "orders clustered key"``
190+
191+
.. include:: /includes/clustered-index-example-fields.rst
192+
193+
The following example adds documents to the ``orders`` collection:
194+
195+
.. code-block:: javascript
196+
197+
db.orders.insertMany( [
198+
{ _id: ISODate( "2022-03-18T12:45:20Z" ), "quantity": 50, "totalOrderPrice": 500 },
199+
{ _id: ISODate( "2022-03-18T12:47:00Z" ), "quantity": 5, "totalOrderPrice": 50 },
200+
{ _id: ISODate( "2022-03-18T12:50:00Z" ), "quantity": 1, "totalOrderPrice": 10 }
201+
] )
202+
203+
The ``_id`` :ref:`clusteredIndex <create.clusteredIndex>` key stores the
204+
order date.
205+
206+
Using the ``_id`` field in a range query improves performance.
207+
For example, the following query uses ``_id`` and :expression:`$gt` to
208+
return the orders where the order date is later than the supplied
209+
date:
210+
211+
.. code-block:: javascript
212+
213+
db.orders.find( { _id: { $gt: ISODate( "2022-03-18T12:47:00.000Z" ) } } )
214+
215+
Example output:
216+
217+
.. code-block:: javascript
218+
:copyable: false
219+
220+
[
221+
{
222+
_id: ISODate( "2022-03-18T12:50:00.000Z" ),
223+
quantity: 1,
224+
totalOrderPrice: 10
225+
}
226+
]
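
A range query can also set both a lower and an upper bound on ``_id``.
The following sketch uses a hypothetical ``orderDateRange`` helper to
build such a filter. With the three sample documents inserted earlier,
the chosen bounds match the orders placed at 12:45:20 and 12:47:00.

.. code-block:: javascript

   // Hypothetical helper: half-open range [start, end) on the _id key.
   function orderDateRange( start, end ) {
      return { _id: { $gte: start, $lt: end } };
   }

   const rangeFilter = orderDateRange(
      new Date( "2022-03-18T12:45:00Z" ),
      new Date( "2022-03-18T12:48:00Z" )
   );

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      printjson( db.orders.find( rangeFilter ).toArray() );
   }
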
227+
228+
Determine if a Collection is Clustered
229+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
230+
231+
To determine if a collection is clustered, use the
232+
:dbcommand:`listCollections` command:
233+
234+
.. code-block:: javascript
235+
236+
db.runCommand( { listCollections: 1 } )
237+
238+
For clustered collections, you will see the :ref:`clusteredIndex
239+
<create.clusteredIndex>` details in the output. For example, the
240+
following output shows the details for the ``orders`` clustered
241+
collection:
242+
243+
.. code-block:: javascript
244+
:copyable: false
245+
246+
...
247+
name: 'orders',
248+
type: 'collection',
249+
options: {
250+
clusteredIndex: {
251+
v: 2,
252+
key: { _id: 1 },
253+
name: 'orders clustered key',
254+
unique: true
255+
}
256+
},
257+
...
258+
259+
``v`` is the index version.
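
To list only the clustered collections in a database, one option is to
pass a ``filter`` to :dbcommand:`listCollections` that matches on the
``options.clusteredIndex`` field shown in the output above. The
following is a sketch of that approach. The variable names are
illustrative.

.. code-block:: javascript

   // Match only collections whose options include a clusteredIndex document.
   const listClusteredCommand = {
      listCollections: 1,
      filter: { "options.clusteredIndex": { $exists: true } }
   };

   // `db` is defined only inside mongosh.
   if ( typeof db !== "undefined" ) {
      const reply = db.runCommand( listClusteredCommand );
      reply.cursor.firstBatch.forEach( function ( info ) {
         print( info.name );
      } );
   }
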

source/core/databases-and-collections.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -127,4 +127,5 @@ or the :method:`db.getCollectionInfos()` method.
127127
/core/views
128128
/core/materialized-views
129129
/core/capped-collections
130+
/core/clustered-collections
130131
/core/timeseries-collections
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
Starting in MongoDB 5.3, you can create a collection with a
2+
:ref:`clustered index <db.createCollection.clusteredIndex>`. Collections
3+
created with a clustered index are called clustered collections.
Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
1+
- ``"key": { _id: 1 }``, which sets the clustered index key to the
2+
``_id`` field.
3+
4+
- ``"unique": true``, which indicates the clustered index key value must
5+
be unique.
6+
7+
- |clustered-index-name|, which sets the clustered index name.
Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,38 @@
1+
.. include:: /includes/clustered-collections-introduction.rst
2+
3+
See :ref:`clustered-collections`.
4+
5+
``clusteredIndex`` has the following syntax:
6+
7+
.. code-block:: javascript
8+
:copyable: false
9+
10+
clusteredIndex: {
11+
key: { <string> },
12+
unique: <boolean>,
13+
name: <string>
14+
}
15+
16+
.. list-table::
17+
:header-rows: 1
18+
19+
* - Field
20+
- Description
21+
22+
* - ``key``
23+
- Required. The clustered index key field. Must be set to ``{ _id:
24+
1 }``. The default value for the ``_id`` field is an
25+
automatically generated unique :ref:`object identifier
26+
<objectid>`, but you can set your own :ref:`clustered index key
27+
values <clustered-collections-clustered-index-key-values>`.
28+
29+
* - ``unique``
30+
- Required. Must be set to ``true``. A unique index indicates the
31+
collection will not accept inserted or updated documents where
32+
the clustered index key value matches an existing value in the
33+
index.
34+
35+
* - ``name``
36+
- Optional. A name that uniquely identifies the clustered index.
37+
38+
.. versionadded:: 5.3
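
The field rules in the table above can be checked on the client before
you send the command. The following ``checkClusteredIndexSpec`` helper
is hypothetical and is not part of MongoDB or ``mongosh``. It is a
sketch that mirrors the table, and the server remains the authority on
what it accepts.

.. code-block:: javascript

   // Hypothetical helper: returns a list of problems, empty if none found.
   function checkClusteredIndexSpec( spec ) {
      const errors = [];
      const key = spec.key;
      const keyFields = ( key && typeof key === "object" ) ? Object.keys( key ) : [];

      // key is required and must be exactly { _id: 1 }.
      if ( keyFields.length !== 1 || keyFields[ 0 ] !== "_id" || key._id !== 1 ) {
         errors.push( "key must be { _id: 1 }" );
      }
      // unique is required and must be true.
      if ( spec.unique !== true ) {
         errors.push( "unique must be true" );
      }
      // name is optional, but must be a string when present.
      if ( spec.name !== undefined && typeof spec.name !== "string" ) {
         errors.push( "name must be a string" );
      }
      return errors;
   }

   const validSpec = { key: { _id: 1 }, unique: true, name: "stocks clustered key" };
   const invalidSpec = { key: { price: 1 }, unique: false };

   print( JSON.stringify( checkClusteredIndexSpec( validSpec ) ) );
   print( JSON.stringify( checkClusteredIndexSpec( invalidSpec ) ) );
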
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
The following :dbcommand:`create` example adds a :ref:`clustered
2+
collection <clustered-collections>` named ``products``:
3+
4+
.. code-block:: javascript
5+
6+
db.runCommand( {
7+
create: "products",
8+
clusteredIndex: { "key": { _id: 1 }, "unique": true, "name": "products clustered key" }
9+
} )
10+
11+
In the example, :ref:`clusteredIndex <create.clusteredIndex>`
12+
specifies:
13+
14+
.. |clustered-index-name| replace:: ``"name": "products clustered key"``
15+
16+
.. include:: /includes/clustered-index-example-fields.rst
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
The following :method:`db.createCollection()` example adds a
2+
:ref:`clustered collection <clustered-collections>` named ``stocks``:
3+
4+
.. code-block:: javascript
5+
6+
db.createCollection(
7+
"stocks",
8+
{ clusteredIndex: { "key": { _id: 1 }, "unique": true, "name": "stocks clustered key" } }
9+
)
10+
11+
In the example, :ref:`clusteredIndex
12+
<db.createCollection.clusteredIndex>` specifies:
13+
14+
.. |clustered-index-name| replace:: ``"name": "stocks clustered key"``
15+
16+
.. include:: /includes/clustered-index-example-fields.rst

source/indexes.txt

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -193,6 +193,13 @@ which indexes the hash of the value of a field. These indexes have a
193193
more random distribution of values along their range, but *only*
194194
support equality matches and cannot support range-based queries.
195195

196+
Clustered Indexes
197+
~~~~~~~~~~~~~~~~~
198+
199+
.. include:: /includes/clustered-collections-introduction.rst
200+
201+
See :ref:`clustered-collections`.
202+
196203
Index Properties
197204
----------------
198205
