@@ -475,52 +475,76 @@ using indexes and document filters.
475
475
Indexes
476
476
~~~~~~~
477
477
478
- The :ref:`query planner <query-plans-query-optimization>` analyzes
479
- an aggregation pipeline to determine if :ref:`indexes <indexes>`
480
- can be used to improve pipeline performance.
478
+ An aggregation pipeline can use :ref:`indexes <indexes>` from the input
479
+ collection to improve performance. Using an index limits the number of
480
+ documents a stage processes. Ideally, an index can :ref:`cover
481
+ <read-operations-covered-query>` the stage query. A covered query has
482
+ especially high performance, since the index alone supplies all of the
483
+ data that the query needs.
481
484
482
- The following list shows some pipeline stages that can use indexes:
485
+ For example, a pipeline that consists of :pipeline:`$match`,
486
+ :pipeline:`$sort`, and :pipeline:`$group` can benefit from indexes at
487
+ every stage:
488
+
489
+ - An index on the :pipeline:`$match` query field can efficiently
490
+ identify the relevant data
491
+
492
+ - An index on the sorting field can return data in sorted order for the
493
+ :pipeline:`$sort` stage
494
+
495
+ - An index on the grouping field that matches the :pipeline:`$sort`
496
+ order can return all of the field values needed to execute the
497
+ :pipeline:`$group` stage (a covered query)
498
+
499
+ To determine whether a pipeline uses indexes, review the query plan and
500
+ look for ``IXSCAN`` or ``DISTINCT_SCAN`` plans.
501
+
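As a rough illustration of the check described above, the following Python sketch walks an explain document and reports whether the winning plan contains an ``IXSCAN`` or ``DISTINCT_SCAN`` stage. The helper names and the sample document are hypothetical: the sample is a hand-written, abbreviated stand-in, not output captured from a server, and real explain output may be shaped differently depending on the server version.

```python
from typing import Any, Iterator, Set

# Plan stage names that indicate index use, as named in the text above.
INDEX_STAGES = {"IXSCAN", "DISTINCT_SCAN"}


def iter_plan_stages(node: Any) -> Iterator[str]:
    """Yield every string stored under a "stage" key, at any depth.

    Rejected plans are skipped so that only stages the planner
    actually chose are reported.
    """
    if isinstance(node, dict):
        stage = node.get("stage")
        if isinstance(stage, str):
            yield stage
        for key, value in node.items():
            if key == "rejectedPlans":
                continue
            yield from iter_plan_stages(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_plan_stages(item)


def index_stages_used(explain_output: dict) -> Set[str]:
    """Return the index-backed stage names found in the explain document."""
    return INDEX_STAGES.intersection(iter_plan_stages(explain_output))


# Hypothetical, abbreviated explain document written by hand for this sketch.
sample_explain = {
    "stages": [
        {
            "$cursor": {
                "queryPlanner": {
                    "winningPlan": {
                        "stage": "FETCH",
                        "inputStage": {"stage": "IXSCAN", "indexName": "status_1"},
                    },
                    "rejectedPlans": [],
                }
            }
        },
        {"$group": {"_id": "$status"}},
    ]
}

print(index_stages_used(sample_explain))
```

An empty result from this helper suggests that the winning plan did not use an index, for example because it fell back to a collection scan.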
502
+ .. note::
503
+ In some cases, the query planner uses a ``DISTINCT_SCAN`` index plan
504
+ that returns one document per index key value. ``DISTINCT_SCAN``
505
+ executes faster than ``IXSCAN`` if there are multiple documents per
506
+ key value. However, index scan parameters might affect the time
507
+ comparison of ``DISTINCT_SCAN`` and ``IXSCAN``.
508
+
509
+ For early stages in your aggregation pipeline, consider indexing the
510
+ query fields. Stages that can benefit from indexes are:
483
511
484
512
``$match`` stage
485
- :pipeline:`$match` can use an index to filter documents if
486
- :pipeline:`$match` is the first stage in a pipeline.
513
+ :pipeline:`$match` can use an index to filter documents if it is the
514
+ first stage in the pipeline, after any optimizations from the
515
+ :ref:`query planner <query-plans-query-optimization>`.
487
516
488
517
``$sort`` stage
489
- :pipeline:`$sort` can use an index if :pipeline:`$sort` is not
490
- preceded by a :pipeline:`$project`, :pipeline:`$unwind`, or
491
- :pipeline:`$group` stage.
518
+ :pipeline:`$sort` can benefit from an index as long as it is not
519
+ preceded by a :pipeline:`$project`, :pipeline:`$unwind`, or
520
+ :pipeline:`$group` stage.
492
521
493
522
``$group`` stage
494
- :pipeline:`$group` can potentially use an index to find the first
495
- document in each group if:
523
+ :pipeline:`$group` can use an index to find the first document in
524
+ each group if it meets all of the following conditions:
496
525
497
- - :pipeline:`$group` is preceded by :pipeline:`$sort` that sorts the
498
- field to group by, and
526
+ - a :pipeline:`$sort` stage sorts the grouping field before
527
+ :pipeline:`$group`
499
528
500
- - there is an index on the grouped field that matches the sort order,
501
- and
529
+ - an index exists that matches the sort order on the grouped field
502
530
503
- - :group:`$first` is the only accumulator in :pipeline:`$group`.
531
+ - :group:`$first` is the only accumulator in the :pipeline:`$group`
532
+ stage
504
533
505
- See :ref:`group-pipeline-optimization` for an example.
534
+ See :ref:`$group Performance Optimizations <group-pipeline-optimization>`
535
+ for an example.
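The three ``$group`` conditions listed above can be sketched as a small Python check over a pipeline definition. This is only an illustration of the documented conditions, not the query planner's actual logic. The function name is hypothetical, the check assumes the pipeline starts with ``$sort`` followed by ``$group``, and it treats an index as matching only when its key pattern equals the sort specification exactly.

```python
from typing import Dict, List


def group_meets_first_doc_conditions(
    pipeline: List[dict], index_keys: Dict[str, int]
) -> bool:
    """Check the listed conditions for a leading [$sort, $group] pair.

    Simplifying assumptions (not from the documentation): the pair must be
    the first two stages, the group key must be a single "$field" path, and
    the index key pattern must equal the sort specification exactly.
    """
    if len(pipeline) < 2:
        return False
    sort_spec = pipeline[0].get("$sort")
    group_spec = pipeline[1].get("$group")
    if sort_spec is None or group_spec is None:
        return False

    group_id = group_spec.get("_id")
    if not (isinstance(group_id, str) and group_id.startswith("$")):
        return False
    group_field = group_id[1:]

    # Condition 1: $sort sorts the grouping field before $group.
    sort_fields = list(sort_spec.items())
    if not sort_fields or sort_fields[0][0] != group_field:
        return False

    # Condition 2: an index exists that matches the sort order.
    if list(index_keys.items()) != sort_fields:
        return False

    # Condition 3: every accumulator in $group is $first.
    accumulators = [spec for name, spec in group_spec.items() if name != "_id"]
    return all(
        isinstance(spec, dict) and list(spec) == ["$first"]
        for spec in accumulators
    )


# Hypothetical pipeline and index key pattern used only for illustration.
pipeline = [
    {"$sort": {"x": 1, "y": 1}},
    {"$group": {"_id": "$x", "y": {"$first": "$y"}}},
]
print(group_meets_first_doc_conditions(pipeline, {"x": 1, "y": 1}))
```

Swapping ``$first`` for another accumulator such as ``$sum``, or supplying an index whose key pattern differs from the sort specification, makes the check return ``False``.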
506
536
507
- ``$geoNear`` stage
508
- :pipeline:`$geoNear` can use a geospatial index. :pipeline:`$geoNear`
509
- must be the first stage in an aggregation pipeline .
537
+ ``$geoNear`` stage
538
+ :pipeline:`$geoNear` always uses an index, since it must be the first
539
+ stage in a pipeline and requires a :ref:`geospatial index <index-feature-geospatial>`.
510
540
511
- Starting in MongoDB 4.2, in some cases, an aggregation pipeline can use
512
- a ``DISTINCT_SCAN`` index plan that returns one document per index key
513
- value.
514
-
515
- .. note::
516
- ``DISTINCT_SCAN`` executes faster than ``IXSCAN`` if multiple
517
- documents per index value exist. However, index scan parameters
518
- might affect the time comparison of ``DISTINCT_SCAN`` and
519
- ``IXSCAN``.
541
+ Additionally, stages later in the pipeline that retrieve data from
542
+ other, unmodified collections can use indexes on those collections
543
+ for optimization. These stages include:
520
544
521
- Indexes can :ref:`cover <read-operations-covered-query>` queries in an
522
- aggregation pipeline. A covered query uses an index to return all of the
523
- documents and has high performance.
545
+ - :pipeline:`$lookup`
546
+ - :pipeline:`$graphLookup`
547
+ - :pipeline:`$unionWith`
524
548
525
549
Document Filters
526
550
~~~~~~~~~~~~~~~~
@@ -576,4 +600,4 @@ MongoDB increases the :pipeline:`$limit` amount with the reordering.
576
600
.. seealso::
577
601
578
602
:method:`explain <db.collection.aggregate()>` option in the
579
- :method:`db.collection.aggregate()`
603
+ :method:`db.collection.aggregate()`