Commit a92743a

DOCSP-28718 28723 Time Series Granularity procedures and practices (#3061)
* Initial stashed commit
* Codeblock rendering fix
* Self proofreading
* Clarified restrictions on bucketing parameters
* Switching to procedure directive
* Switching to procedure directive
* Internal PR feedback
* Internal review feedback
* Internal review feedback--amend
* Internal review feedback--amend
* Internal review feedback--amend
* External review feedback
1 parent 27fa16e commit a92743a

4 files changed

+273
-177
lines changed


source/core/timeseries/timeseries-best-practices.txt

Lines changed: 42 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -213,7 +213,45 @@ ratio.
213213
Optimize Query Performance
214214
--------------------------
215215

216-
To improve query performance,
217-
:ref:`create one or more secondary indexes <timeseries-add-secondary-index>`
218-
on your ``timeField`` and ``metaField`` to support common query
219-
patterns.
216+
Set Appropriate Bucket Granularity
217+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218+
When you create a time series collection, MongoDB groups incoming time
219+
series data into buckets. By accurately setting granularity, you control
220+
how frequently data is bucketed based on the ingestion rate of your data.
221+
222+
Starting in MongoDB 6.3, you can use the custom bucketing parameters
223+
``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds`` to specify bucket
224+
boundaries and more precisely control how time series data is bucketed.
225+
226+
You can improve performance by setting the ``granularity`` or custom
227+
bucketing parameters to the best match for the time span between
228+
incoming measurements from the same data source. For example, if you are
229+
recording weather data from thousands of sensors but only record data
230+
from each sensor once per 5 minutes, you can either set ``granularity``
231+
to ``"minutes"`` or set the custom bucketing parameters to ``300``
232+
(seconds).
233+
234+
In this case, setting the ``granularity`` to ``hours`` groups up to a
235+
month's worth of data ingest events into a single bucket, resulting in
236+
longer traversal times and slower queries. Setting it to ``seconds``
237+
leads to multiple buckets per polling interval, many of which
238+
might contain only a single document.
239+
240+
The following table shows the maximum time interval included in one
241+
bucket of data when using a given ``granularity`` value:
242+
243+
.. include:: /includes/table-timeseries-granularity-intervals.rst
244+
245+
.. seealso::
246+
247+
:ref:`Timing of Automatic Removal
248+
<timeseries-collection-delete-operations-timing>`
249+
250+
Create Secondary Indexes
251+
~~~~~~~~~~~~~~~~~~~~~~~~
252+
253+
To improve query performance, :ref:`create one or more secondary indexes
254+
<timeseries-add-secondary-index>` on your ``timeField`` and
255+
``metaField`` to support common query patterns. In MongoDB 6.3 and
256+
higher, MongoDB automatically creates a compound index on the
257+
``metaField`` and ``timeField`` for new time series collections.

source/core/timeseries/timeseries-granularity.txt

Lines changed: 104 additions & 120 deletions
Original file line numberDiff line numberDiff line change
@@ -16,10 +16,15 @@ Set Granularity for Time Series Data
1616
:description: Time series, granularity, IOT
1717
:keywords: Time series, granularity, IOT
1818

19-
When you create a :ref:`time series collection
20-
<manual-timeseries-collection>`, MongoDB automatically creates a ``system.buckets``
21-
:ref:`system collection <metadata-system-collections>` based on the ``granularity`` value you specify, and groups
22-
documents into those buckets.
19+
When you create a time series collection, MongoDB automatically creates
20+
a ``system.buckets`` :ref:`system collection
21+
<metadata-system-collections>` and groups incoming time series data
22+
into buckets. By setting granularity, you control how
23+
frequently data is bucketed based on the ingestion rate of your data.
24+
25+
Starting in MongoDB 6.3, you can use the custom bucketing parameters
26+
``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds`` to specify bucket
27+
boundaries and more accurately control how time series data is bucketed.
2328

2429
.. note::
2530

@@ -28,83 +33,18 @@ documents into those buckets.
2833
created. See :ref:`MongoDB 5.0 known issues
2934
<5.0-known-issue-granularity>`.
3035

31-
When you create a :ref:`time series collection
32-
<manual-timeseries-collection>`, you can specify the
33-
``granularity`` parameter. The ``granularity`` determines how data
34-
in the time series collection is stored internally. For details about
35-
how MongoDB stores time series data, see :ref:`flexible-bucketing`.
36-
37-
By default, ``granularity`` is set to ``"seconds"``. To optimize
38-
data storage, set the ``granularity`` to a value that is closest
39-
to the ingestion rate for your data source as specified by
40-
the ``metaField`` field. The ingestion rate is the time span between
41-
consecutive incoming measurements that have the same unique value for
42-
the ``metaField``.
43-
44-
Consider the following example:
45-
46-
.. code-block:: javascript
47-
48-
db.createCollection(
49-
"weather24h",
50-
{
51-
timeseries: {
52-
timeField: "timestamp",
53-
metaField: "metadata",
54-
granularity: "minutes"
55-
},
56-
expireAfterSeconds: 86400
57-
}
58-
)
59-
60-
If your ``metaField`` data identifies weather sensors and
61-
you ingest data from each individual sensor once every 5 minutes, you
62-
should choose ``"minutes"``. Even if you have thousands of sensors and
63-
the data coming in from different sensors is only seconds apart, the
64-
``granularity`` should still be based on the ingestion rate for one
65-
sensor that is uniquely identified by its metadata.
66-
67-
In the following table, you can see the max time span of data that is
68-
stored together for each ``granularity`` value:
69-
70-
.. list-table::
71-
:header-rows: 1
72-
:widths: 40 60
73-
74-
* - ``granularity``
36+
Retrieve the Current Bucketing Parameters
37+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
7538

76-
- Covered Time Span
77-
78-
* - ``"seconds"`` (default)
79-
80-
- one hour
81-
82-
* - ``"minutes"``
83-
84-
- 24 hours
85-
86-
* - ``"hours"``
87-
88-
- 30 days
89-
90-
91-
.. seealso::
92-
93-
:ref:`Timing of Automatic Removal
94-
<timeseries-collection-delete-operations-timing>`
95-
96-
Retrieve the ``granularity`` of a Time Series Collection
97-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
98-
99-
To retrieve the current value of ``granularity``, use the
39+
To retrieve the current bucketing parameters of a collection, use the
10040
:dbcommand:`listCollections` command:
10141

10242
.. code-block:: javascript
10343

10444
db.runCommand( { listCollections: 1 } )
10545

106-
The result document contains a document for the time series collection
107-
which contains the ``options.timeseries.granularity`` field.
46+
For time series collections, the ``options.timeseries`` document in the output
47+
contains the ``granularity``, ``bucketMaxSpanSeconds``, and ``bucketRoundingSeconds`` fields, if present.
10848

10949
.. code-block:: javascript
11050
:copyable: false
@@ -123,7 +63,8 @@ which contains the ``options.timeseries.granularity`` field.
12363
timeField: <string>,
12464
metaField: <string>,
12565
granularity: <string>,
126-
bucketMaxSpanSeconds: <number>
66+
bucketMaxSpanSeconds: <number>,
67+
bucketRoundingSeconds: <number>
12768
}
12869
},
12970
...
@@ -133,55 +74,70 @@ which contains the ``options.timeseries.granularity`` field.
13374
}
13475
}
13576

136-
Change the ``granularity`` of a Time Series Collection
137-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
13877

139-
To change the ``granularity`` value, issue the following
140-
:dbcommand:`collMod` command:
78+
Using the "granularity" Field
79+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
14180

142-
.. code-block:: javascript
81+
The following table shows the maximum time interval included in one
82+
bucket of data when using a given ``granularity`` value:
14383

144-
db.runCommand({
145-
collMod: "weather24h",
146-
timeseries: { granularity: "hours" }
147-
})
84+
.. include:: /includes/table-timeseries-granularity-intervals.rst
14885

149-
Once set, you cannot decrease granularity. For example, if granularity
150-
is set to ``minutes``, you can increase it to ``hours``, but you cannot
151-
decrease it to ``seconds``.
86+
By default, ``granularity`` is set to ``seconds``. You can improve performance by setting the ``granularity`` value to the
87+
closest match to the time span between incoming measurements from the
88+
same data source. For example, if you are recording weather data from
89+
thousands of sensors but only record data from each sensor once per 5
90+
minutes, set ``granularity`` to ``"minutes"``.
15291

153-
.. note::
92+
.. code-block:: javascript
15493

155-
You cannot modify the ``granularity`` of a sharded time series
156-
collection.
94+
db.createCollection(
95+
"weather24h",
96+
{
97+
timeseries: {
98+
timeField: "timestamp",
99+
metaField: "metadata",
100+
granularity: "minutes"
101+
},
102+
expireAfterSeconds: 86400
103+
}
104+
)
157105

158-
.. _flexible-bucketing:
106+
Setting the ``granularity`` to ``hours`` groups up to a month's
107+
worth of data ingest events into a single bucket, resulting in longer
108+
traversal times and slower queries. Setting it to ``seconds``
109+
leads to multiple buckets per polling interval, many of which
110+
might contain only a single document.
159111

160-
Flexible Bucketing
161-
~~~~~~~~~~~~~~~~~~
112+
.. seealso::
162113

163-
MongoDB groups incoming time series data into buckets. By setting the
164-
``granularity`` parameter, you control how frequently data is bucketed
165-
based on the ingestion rate of your data.
114+
:ref:`Timing of Automatic Removal
115+
<timeseries-collection-delete-operations-timing>`
116+
117+
.. _flexible-bucketing:
166118

167-
Starting in MongoDB 6.3, you can specify the bucket boundaries to
168-
more accurately control how time series data is bucketed. Use the
169-
following parameters:
119+
Using Custom Bucketing Parameters
120+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
170121

171-
- ``bucketMaxSpanSeconds``, the maximum time span between measurements
172-
in a bucket.
173-
- ``bucketRoundingSeconds``, the time interval that determines the
174-
starting timestamp for a new bucket.
122+
In MongoDB 6.3 and higher, instead of ``granularity``, you can set
123+
bucket boundaries manually using the two custom bucketing parameters.
124+
Consider this approach if you need the additional precision to optimize
125+
a high volume of queries and :dbcommand:`insert` operations.
175126

176-
.. note::
127+
To use custom bucketing parameters, set both parameters to the same
128+
value, and do not set ``granularity``:
177129

178-
If you want to specify the bucket boundaries:
130+
- ``bucketMaxSpanSeconds`` sets the maximum time between timestamps
131+
in the same bucket. Valid values are 1 to 31536000 seconds (365 days).
179132

180-
- You must set both ``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds``.
181-
- Both parameters must have the same value.
182-
- You can't additionally set the ``granularity`` parameter.
133+
- ``bucketRoundingSeconds`` sets the time interval that determines the
134+
starting timestamp for a new bucket. When a document requires a new
135+
bucket, MongoDB rounds down the document's timestamp value by this
136+
interval to set the minimum time for the bucket.
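
The rules for the two custom bucketing parameters can be sketched as a small client-side check. This is only an illustration: ``checkCustomBucketing`` is a hypothetical helper, not part of MongoDB or its drivers, and it assumes the values are whole numbers of seconds. The server performs its own validation regardless.

```javascript
// Hypothetical helper (not a MongoDB API): checks a `timeseries` options
// object against the rules described above before it is sent to the server.
const MAX_BUCKET_SECONDS = 31536000; // upper bound quoted in the text

function checkCustomBucketing(timeseries) {
  const errors = [];
  const span = timeseries.bucketMaxSpanSeconds;
  const rounding = timeseries.bucketRoundingSeconds;

  // Custom bucketing parameters replace `granularity`; they cannot be combined.
  if (timeseries.granularity !== undefined) {
    errors.push("do not set granularity with custom bucketing parameters");
  }

  // Assumption: values are integers; the text only states the 1-31536000 range.
  for (const [name, value] of [
    ["bucketMaxSpanSeconds", span],
    ["bucketRoundingSeconds", rounding],
  ]) {
    if (!Number.isInteger(value) || value < 1 || value > MAX_BUCKET_SECONDS) {
      errors.push(`${name} must be an integer from 1 to ${MAX_BUCKET_SECONDS}`);
    }
  }

  // Both parameters must be set to the same value.
  if (span !== rounding) {
    errors.push("bucketMaxSpanSeconds and bucketRoundingSeconds must be equal");
  }
  return errors;
}

// Example: a valid configuration yields an empty error list.
console.log(checkCustomBucketing({
  timeField: "timestamp",
  metaField: "metadata",
  bucketMaxSpanSeconds: 300,
  bucketRoundingSeconds: 300,
}));
```

Returning a list of messages rather than throwing lets a caller report every violated rule at once.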
183137

184-
Consider the following time series collection:
138+
For the weather station example with 5-minute sensor intervals, you
139+
could fine-tune bucketing by setting the custom bucketing parameters to
140+
300 seconds, instead of using a ``granularity`` of ``"minutes"``:
185141

186142
.. code-block:: javascript
187143

@@ -191,20 +147,48 @@ Consider the following time series collection:
191147
timeseries: {
192148
timeField: "timestamp",
193149
metaField: "metadata",
194-
bucketMaxSpanSeconds: 60,
195-
bucketRoundingSeconds: 60
150+
bucketMaxSpanSeconds: 300,
151+
bucketRoundingSeconds: 300
196152
}
197153
}
198154
)
199155

200-
Each bucket has a maximum time span of 60 seconds. When MongoDB opens a
201-
new bucket, the starting timestamp is rounded down to the nearest 60-second
202-
interval.
156+
If a document with a time of ``2023-03-27T18:24:35Z`` does not fit an
157+
existing bucket, MongoDB creates a new bucket with a minimum time of
158+
``2023-03-27T18:20:00Z`` and a maximum time of ``2023-03-27T18:24:59Z``.
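
The boundary arithmetic in this example can be reproduced with a short sketch. ``bucketBounds`` is a hypothetical helper written for illustration, not a MongoDB function. It assumes that rounding is done relative to the Unix epoch and that the maximum is reported at whole-second resolution, which matches the values quoted above.

```javascript
// Hypothetical helper (not a MongoDB API): computes the bucket boundaries for
// a timestamp when both custom bucketing parameters equal `bucketSeconds`.
function bucketBounds(timestamp, bucketSeconds) {
  const spanMs = bucketSeconds * 1000;
  // Round the timestamp down to the nearest multiple of the interval.
  const minMs = Math.floor(timestamp.getTime() / spanMs) * spanMs;
  // Assumption: the last whole second before the next boundary is the maximum.
  const maxMs = minMs + spanMs - 1000;
  return { min: new Date(minMs), max: new Date(maxMs) };
}

const bounds = bucketBounds(new Date("2023-03-27T18:24:35Z"), 300);
console.log(bounds.min.toISOString()); // 2023-03-27T18:20:00.000Z
console.log(bounds.max.toISOString()); // 2023-03-27T18:24:59.000Z
```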
203159

204-
For example, if you insert a new measurement into the collection with a
205-
timestamp of ``11:59:59``, the measurement is added to the bucket
206-
with boundaries between ``11:59:00`` and ``12:00:00`` (non-inclusive).
207-
If the bucket doesn't exist, the starting timestamp of the new bucket
208-
is determined by rounding ``11:59:59`` down to ``11:59:00``.
160+
Change Time Series Granularity
161+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
162+
163+
You can increase ``timeseries.granularity`` from a shorter unit of time
164+
to a longer one using a :dbcommand:`collMod` command.
165+
166+
.. code-block:: javascript
167+
168+
db.runCommand({
169+
collMod: "weather24h",
170+
timeseries: { granularity: "hours" } // must be a longer unit than the current "minutes"
171+
})
172+
173+
If you are using the custom bucketing parameters
174+
``bucketRoundingSeconds`` and ``bucketMaxSpanSeconds`` instead of
175+
``granularity``, include both custom parameters in the ``collMod``
176+
command and set them to the same value:
177+
178+
.. code-block:: javascript
179+
180+
db.runCommand({
181+
collMod: "weather24h",
182+
timeseries: {
183+
bucketRoundingSeconds: 86400,
184+
bucketMaxSpanSeconds: 86400
185+
}
186+
})
187+
188+
You cannot decrease the granularity interval or the custom bucketing
189+
values.
190+
191+
.. note::
209192

210-
For more information on time series parameters, see :ref:`time-series-fields`.
193+
To modify the granularity of a **sharded** time series collection,
194+
you must be running MongoDB 6.0 or later.

source/core/timeseries/timeseries-limitations.txt

Lines changed: 9 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -154,24 +154,13 @@ parameters later.
154154

155155
.. _timeseries-limitations-granularity:
156156

157-
Modification of ``granularity``
158-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
159-
160-
After you set the ``granularity``, you can only increase it one level at
161-
a time. The ``granularity`` can change from ``"seconds"`` to
162-
``"minutes"`` or from ``"minutes"`` to ``"hours"``. Other changes are
163-
not allowed.
164-
165-
To change the ``granularity`` from ``"seconds"`` to ``"hours"``, first
166-
increase the ``granularity`` to ``"minutes"`` and then to ``"hours"``.
167-
168-
Modification of Bucket Parameters
169-
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
157+
Modifying Bucket Parameters
158+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
170159

171-
Once you set a collection's ``bucketMaxSpanSeconds`` and
172-
``bucketRoundingSeconds`` parameters, they can only be increased. Use
173-
the :dbcommand:`collMod` command to modify the ``bucketMaxSpanSeconds``
174-
and ``bucketRoundingSeconds`` parameters. For example:
160+
Once you set a collection's ``granularity`` or custom bucketing
161+
parameters ``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds``, you
162+
can increase them, but not decrease them.
163+
Use the :dbcommand:`collMod` command to modify the parameters. For example:
175164

176165
.. code-block:: javascript
177166

@@ -182,8 +171,9 @@ and ``bucketRoundingSeconds`` parameters. For example:
182171

183172
.. note::
184173

185-
The ``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds`` parameters must be
186-
equal. If you modify one parameter, you must also modify the other.
174+
``bucketMaxSpanSeconds`` and ``bucketRoundingSeconds`` must be
175+
equal. If you modify one parameter, you must also set the other to
176+
the same value.
187177

188178
.. _time-series-limitations-sharding:
189179
