
Commit 19966e9

biniona-mongodb authored and David Cuthbert committed
(DOSCP-15813) Change Streams (#138)
Co-authored-by: David Cuthbert <[email protected]>
1 parent bb785c1 commit 19966e9

4 files changed: +96 -2 lines changed


snooty.toml

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 name = "kafka-connector"
 title = "MongoDB Kafka Connector"
 intersphinx = ["https://docs.mongodb.com/manual/objects.inv"]
-
 toc_landing_pages = ["/sink-connector", "/sink-connector/configuration-properties", "/source-connector/configuration-properties"]
+
+[constants]
+mkc = "MongoDB Kafka Connector"

source/sink-connector/configuration-properties/topic-override.txt

Lines changed: 1 addition & 0 deletions
@@ -77,5 +77,6 @@ the following for data consumed from ``topicA``:
 - Omit fields ``k2`` and ``k4`` from the value projection using the
   ``BlockList`` projection type.
 
+
 For an example of how to configure the Block List Projector, see the
 (TODO: Sink Post-Processors guide).
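
For orientation, the ``BlockList`` projection this hunk references is driven by the sink connector's projection settings. A minimal sketch of what such a configuration might look like, assuming the connector's ``topic.override`` prefix and ``value.projection.*`` property names apply here (treat the exact keys and values as illustrative, not as the guide the TODO will link to):

    # Sketch: omit fields k2 and k4 from the value projection for
    # records consumed from topicA (illustrative keys and values).
    topic.override.topicA.value.projection.type=BlockList
    topic.override.topicA.value.projection.list=k2,k4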

source/source-connector/fundamentals/change-streams.txt

Lines changed: 89 additions & 1 deletion
@@ -2,4 +2,92 @@
 Change Streams
 ==============
 
-TODO
+.. default-domain:: mongodb
+
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 2
+   :class: singlecol
+
+Overview
+--------
+
+In this guide, you can learn about **change streams** and how they are
+used in a {+mkc+} source connector.
+
+Change Streams
+--------------
+
+Change streams are a MongoDB feature that allows you to receive real-time
+updates on data changes. Change streams return
+**change event documents**. A change event document is a document in your
+**oplog** that contains idempotent instructions to recreate a change that
+occurred in your MongoDB deployment, as well as the metadata related to that
+change.
+
+The oplog is a special collection in MongoDB that keeps track of all changes
+within a MongoDB replica set. Change streams help you use the change event data
+stored in the oplog without having to learn the details of how the oplog
+works.
+
+.. important:: You must have a replica set or a sharded cluster
+
+   Change streams are available for replica sets and sharded clusters. A
+   standalone MongoDB instance cannot produce a change stream.
+
+For more information on change streams, see the following MongoDB manual entries:
+
+- :manual:`Change streams </changeStreams/>`
+- :manual:`Oplog </core/replica-set-oplog/>`
+
+Change Event Structure
+~~~~~~~~~~~~~~~~~~~~~~
+
+You can find the complete structure of change event documents, including
+descriptions of all fields,
+:ref:`in the MongoDB manual <change-stream-output>`.
+
+.. note:: The Full Document Option
+
+   If you want Kafka Connect to receive just the document created or modified
+   by your change operation, use the ``publish.full.document.only=true``
+   option. For more information, see our
+   guide on <TODO: Link to source connector properties>.
+
+Source Connectors
+-----------------
+
+A {+mkc+} source connector works by opening a single change stream with
+MongoDB and sending data from that change stream to Kafka Connect. Your source
+connector maintains its change stream for the duration of its runtime, and your
+connector closes its change stream when you stop it.
+
+.. note:: MongoDB Java driver
+
+   Your source connector uses the MongoDB Java driver to create a change
+   stream. For more information, see this guide on change streams in the MongoDB
+   Java driver. <TODO: Link to Watch For Changes Usage Example>
+
+Resume Tokens
+~~~~~~~~~~~~~
+
+Your connector stores a **resume token** to keep track of the changes
+it has processed. A resume token is a piece of data that references
+the ``_id`` field of a change event document in your MongoDB oplog.
+Your connector only processes relevant change event documents written to the
+oplog after the document referenced by its resume token.
+
+If your source connector does not have a resume token, such as when you start
+the connector for the first time, your connector processes relevant change
+events written to the oplog after it first connects to MongoDB.
+
+.. <TODO for Ross>: Confirm if this is after the source connector first connects
+   to MongoDB, or after it first starts. Pretty sure it is after it first
+   connects to MongoDB. I think this is getting in the weeds of how change streams are
+   implemented in the Java driver and I'm not sure if this is a valuable distinction.
+
+If your source connector's resume token does not correspond to any entry in your
+oplog, your connector has an invalid resume token. To learn how to recover from an
+invalid resume token,
+:ref:`see our troubleshooting guide <kafka-troubleshoot-recover-invalid-resume-token>`.
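
Since the Source Connectors section above says the connector opens its change stream through the MongoDB Java driver, a minimal, self-contained sketch of watching a collection with the driver directly may be a useful point of reference; the connection string, database, and collection names are placeholders, and this is not the connector's internal code:

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    public class WatchExample {
        public static void main(String[] args) {
            // Placeholder URI; change streams require a replica set or
            // sharded cluster, not a standalone mongod.
            String uri = "mongodb://localhost:27017/?replicaSet=rs0";
            try (MongoClient client = MongoClients.create(uri)) {
                MongoCollection<Document> collection =
                        client.getDatabase("test").getCollection("inventory");

                // Each element emitted by watch() is a change event document.
                collection.watch().forEach(event ->
                        System.out.println(event.getOperationType() + ": "
                                + event.getResumeToken()));
            }
        }
    }

The source connector manages an equivalent stream internally for its entire runtime, as the section above describes.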
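
The Full Document Option note above references ``publish.full.document.only=true``. As a hedged sketch, a source connector properties file using that option could look like the following; the connection values are placeholders and the property set is deliberately minimal:

    # Sketch of a minimal source connector configuration (placeholder values).
    name=mongo-source-example
    connector.class=com.mongodb.kafka.connect.MongoSourceConnector
    connection.uri=mongodb://localhost:27017/?replicaSet=rs0
    database=test
    collection=inventory
    # Send only the created or modified document, not the full change event.
    publish.full.document.only=true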
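
The Resume Tokens section describes resuming from a stored token, and the Java driver exposes the same mechanism. A sketch, assuming ``savedToken`` would in practice be loaded from durable storage rather than captured in the same run:

    import com.mongodb.client.MongoChangeStreamCursor;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.model.changestream.ChangeStreamDocument;
    import org.bson.BsonDocument;
    import org.bson.Document;

    public class ResumeExample {
        public static void main(String[] args) {
            try (MongoClient client =
                    MongoClients.create("mongodb://localhost:27017/?replicaSet=rs0")) {
                var collection = client.getDatabase("test").getCollection("inventory");

                // Capture a resume token from the next event
                // (blocks until an event arrives).
                BsonDocument savedToken;
                try (MongoChangeStreamCursor<ChangeStreamDocument<Document>> cursor =
                        collection.watch().cursor()) {
                    savedToken = cursor.next().getResumeToken();
                }

                // Later: process only events written after the referenced event.
                collection.watch().resumeAfter(savedToken).forEach(event ->
                        System.out.println(event.getOperationType()));
            }
        }
    }

If the stored token no longer corresponds to any oplog entry, the stream fails to resume, which is the invalid resume token case the troubleshooting guide below covers.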

source/troubleshooting/recover-from-invalid-resume-token.txt

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ Invalid Resume Token
    :depth: 2
    :class: singlecol
 
+
+.. _kafka-troubleshoot-recover-invalid-resume-token:
+
 Overview
 --------
