DOCSP-1147: ChangeStreams #3073

1 change: 1 addition & 0 deletions config/sphinx_local.yaml
@@ -80,6 +80,7 @@ theme:
- /release-notes/3.6
- /security
- /sharding
- /changeStreams
- /core/zone-sharding
- /core/hashed-sharding
- /core/ranged-sharding
@@ -0,0 +1,85 @@
.. index:: changeStreams notification

=========================================
Change Streams Production Recommendations
=========================================

.. default-domain:: mongodb

.. contents:: On this page
:local:
:backlinks: none
:depth: 1
:class: singlecol

If you drop a collection with change streams opened against it, the change
stream cursors close when they advance to that point in the oplog. Change
stream cursors with the ``fullDocument : updateLookup`` option may return
``null`` for the lookup document.

Attempting to resume a change stream against a dropped collection results in
an error. Any data changes that occurred on the collection between the last
event the change stream captured and the collection drop event are lost.

Change stream response documents must adhere to the 16MB BSON document limit.
Depending on the size of documents in the collection against which you open a
change stream, notifications may fail if the resulting notification document
exceeds the 16MB limit. For example, an update operation on a change stream
configured to return the full updated document, or an insert or replace
operation involving a document at or just below the limit, may produce a
notification document that exceeds 16MB.
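
As a rough illustration (not part of any MongoDB API), the following Python
sketch estimates whether a notification carrying a full document could hit the
limit. It uses JSON size as a stand-in for BSON size, and the envelope
overhead value is a hypothetical placeholder:

```python
import json

# Change stream notifications must respect the 16MB BSON document limit.
BSON_MAX_BYTES = 16 * 1024 * 1024

def may_exceed_limit(full_document: dict, envelope_overhead: int = 1024) -> bool:
    """Rough client-side estimate (JSON size approximating BSON size) of
    whether a notification containing full_document plus the change event
    envelope could exceed the 16MB limit."""
    approx_size = len(json.dumps(full_document).encode("utf-8")) + envelope_overhead
    return approx_size > BSON_MAX_BYTES

# A document just below the limit leaves no headroom for the event envelope.
big_doc = {"_id": 1, "payload": "x" * (BSON_MAX_BYTES - 512)}
print(may_exceed_limit(big_doc))  # True: the envelope pushes it past 16MB
```

The same reasoning applies to ``fullDocument : updateLookup``: the looked-up
document, not the update delta, determines the notification size.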

Replica Sets
------------

For replica sets with :term:`arbiter` members, change streams may remain
idle if enough data-bearing members are unavailable such that operations cannot
be majority committed.

For example, consider a 3-member replica set with two data-bearing nodes and
an arbiter. If the secondary goes down, such as due to failure or an upgrade,
writes cannot be majority committed. The change stream remains open, but does
not send any notifications.

In this scenario, the application can catch up to all operations that occurred
during the downtime so long as the last operation the application received is
still in the oplog of the replica set.

If you anticipate significant downtime, such as for an upgrade or a major
disaster, consider increasing the size of the oplog so that operations are
retained for longer than the estimated downtime.
Use :method:`rs.printReplicationInfo()` to retrieve information on the
oplog status, including the size of the oplog and the time range of operations.
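
The sizing arithmetic can be sketched as follows. The write rate, downtime
window, and safety factor below are hypothetical values; in practice you would
derive the write rate from the oplog window reported by
``rs.printReplicationInfo()``:

```python
def required_oplog_mb(write_mb_per_hour: float,
                      estimated_downtime_hours: float,
                      safety_factor: float = 2.0) -> float:
    """Minimum oplog size (MB) needed to retain operations for longer than
    the estimated downtime, with a safety margin for write-rate spikes."""
    return write_mb_per_hour * estimated_downtime_hours * safety_factor

# Hypothetical workload: 500 MB of oplog entries per hour, 8-hour upgrade window.
print(required_oplog_mb(500, 8))  # 8000.0
```

If the computed size exceeds the current oplog size, the change stream may not
be resumable after the downtime.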

Sharded Clusters
----------------

Change streams provide a total ordering of changes across shards by utilizing
a global logical clock. MongoDB guarantees that the order of changes is
preserved and that change stream notifications can be safely interpreted in the order
received. For example, a change stream cursor opened against a 3-shard sharded
cluster returns change notifications respecting the total order of those
changes across all three shards.

To guarantee total ordering of changes, for each change notification the
:program:`mongos` checks with each shard to see if the shard has seen more
recent changes. Sharded clusters with one or more shards that have little or
no activity for the collection, or are "cold", can negatively affect the
response time of the change stream as the :program:`mongos` must still check
with those cold shards to guarantee total ordering of changes. This effect may
be more apparent with geographically distributed shards, or workloads where
the majority of operations occur on a subset of shards in the cluster.

If a sharded collection has high levels of activity, the :program:`mongos`
may not be able to keep up with the changes across all of the shards.
Consider utilizing notification filters for these types of collections.
For example, you can pass a :pipeline:`$match` stage configured to filter
only ``insert`` operations.
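
In a driver, this pipeline is passed to the ``watch()`` helper and the server
applies it before notifications reach the client. As a sketch, the following
pure-Python simulation applies the same equality-only ``$match`` predicate to
sample change events (the event shapes are illustrative):

```python
# Pipeline you would pass to watch() to receive only insert notifications.
insert_only = [{"$match": {"operationType": "insert"}}]

def apply_simple_match(pipeline, events):
    """Client-side simulation of a single equality-only $match stage.
    In production the server evaluates the pipeline, not the client."""
    criteria = pipeline[0]["$match"]
    return [e for e in events
            if all(e.get(k) == v for k, v in criteria.items())]

events = [
    {"operationType": "insert", "documentKey": {"_id": 1}},
    {"operationType": "update", "documentKey": {"_id": 1}},
    {"operationType": "insert", "documentKey": {"_id": 2}},
]
print(len(apply_simple_match(insert_only, events)))  # 2
```

Server-side filtering reduces the volume of notifications the
:program:`mongos` must totally order and return.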

For sharded collections, update operations with :ref:`multi : true
<multi-parameter>` may cause any change streams opened against that collection
to send notifications for :term:`orphaned documents <orphaned document>`.

From the moment an unsharded collection is sharded until the time the change
stream catches up to the first chunk migration, the ``documentKey`` in the
change stream notification document only includes the ``_id`` of the document,
not the full shard key.
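
The two ``documentKey`` shapes can be contrasted with hypothetical events for
a collection sharded on a ``region`` field (field names and values below are
illustrative, not produced by a real deployment):

```python
# Event observed before the change stream catches up to the first chunk
# migration: documentKey carries only the _id.
before_catch_up = {
    "operationType": "insert",
    "documentKey": {"_id": 101},
}

# Event observed after catch-up: documentKey carries the full shard key.
after_catch_up = {
    "operationType": "insert",
    "documentKey": {"region": "emea", "_id": 102},
}

print("region" in before_catch_up["documentKey"])  # False
print("region" in after_catch_up["documentKey"])   # True
```

Applications that key their downstream processing on the shard key should
tolerate the ``_id``-only shape during this window.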
64 changes: 64 additions & 0 deletions source/changeStreams.txt
@@ -0,0 +1,64 @@
.. index:: changeStreams notification

.. _changeStreams:

.. _collection_watch:

==============
Change Streams
==============

.. default-domain:: mongodb

.. contents:: On this page
:local:
:backlinks: none
:depth: 1
:class: singlecol

.. versionadded:: 3.6

.. toctree::
:titlesonly:

/tutorial/change-streams-example
/administration/change-streams-production-recommendations
/reference/change-events

Change streams allow applications to access real-time data changes without the
complexity and risk of tailing the :term:`oplog`. Applications can use change
streams to subscribe to all data changes on a :term:`collection` and
immediately react to them.

You can only open a change stream against a replica set or a :term:`sharded
cluster` with replica set :term:`shards <shard>`. For development environments,
a single-member replica set or shard replica set is sufficient to utilize
change streams.

Change streams can benefit architectures with reliant business systems,
informing downstream systems once data changes are durable. For example,
change streams can save time for developers when implementing Extract,
Transform, and Load (ETL) services, cross-platform synchronization,
collaboration functionality, and notification services.

See :doc:`/tutorial/change-streams-example` for examples of opening and
configuring change streams.

Change streams only notify on data changes that have persisted to a majority
of data-bearing members in the replica set. This ensures notifications are
triggered only by majority-committed changes that are durable in failure
scenarios.

For example, consider a 3-member :term:`replica set` with a change stream
cursor opened against the :term:`primary`. If a client issues an insert
operation, the change stream only notifies the application of the data change
once that insert has persisted to a majority of data-bearing members.

For deployments enforcing :ref:`authentication` and :ref:`authorization
<authorization>`, applications can only open change streams against
collections they have read access to.

Change streams are resumable, as long as the cluster still has enough history
in the :term:`oplog` or each shard replica set's oplog to locate the last
operation that the application received. See :ref:`change-stream-resume` for
documentation and examples on resuming change streams.
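
In drivers, the resume point is the opaque ``_id`` of each change event, and
(in PyMongo, for example) it is passed back via the ``resume_after`` option of
``watch()``. The following pure-Python sketch shows only the bookkeeping; the
token value is a made-up placeholder, and no server connection is made:

```python
def record_token(event, state):
    """Persist the resume token (the event's _id) after processing an event.
    In production, write this to durable storage, not a dict."""
    state["resume_token"] = event["_id"]

def resume_options(state):
    """Build the options to pass back to watch() when resuming."""
    token = state.get("resume_token")
    return {"resume_after": token} if token else {}

state = {}
# Hypothetical change event; real resume tokens are opaque documents.
for event in [{"_id": {"_data": "8260..."}, "operationType": "insert"}]:
    record_token(event, state)

print(resume_options(state))  # {'resume_after': {'_data': '8260...'}}
```

Recording the token only after an event is fully processed ensures that a
resumed stream replays, rather than skips, any event lost mid-processing.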
1 change: 1 addition & 0 deletions source/contents.txt
@@ -19,6 +19,7 @@ project, this Manual and additional editions of this text.
/storage
/security
/replication
/changeStreams
/sharding
/faq
/reference