Commit 22a5f82

biniona-mongodb, kyuan-mongodb, RWaltersMA, and ccho-mongodb
authored and committed
Docsp 18413 tutorial cdc handler (#143)
Co-authored-by: Kailie Yuan <[email protected]> Co-authored-by: Robert Walters <[email protected]> Co-authored-by: Chris Cho <[email protected]>
1 parent 2b0aa01 commit 22a5f82

9 files changed

+365
-40
lines changed

Lines changed: 284 additions & 0 deletions
@@ -0,0 +1,284 @@
1+
title: Download Source Files
2+
ref: cdc-tutorial-download-files
3+
content: |
4+
.. include:: /includes/tutorials/get-pipeline.rst
5+
6+
Once you have cloned the repository, switch to the
7+
``cdc-tutorial`` branch with the following command:
8+
9+
.. code-block:: bash
10+
11+
git checkout cdc-tutorial
12+
13+
---
14+
title: Set Up the Environment
15+
ref: cdc-tutorial-set-up-environment
16+
content: |
17+
The sample pipeline consists of the following tools running in Docker
18+
containers on your computer:
19+
20+
- A MongoDB replica set
21+
- An Apache Kafka instance
22+
- A Kafka Connect instance with the MongoDB Kafka Connector installed
23+
- A Zookeeper instance (Zookeeper is a dependency of Apache Kafka)
24+
25+
The pipeline comes with a source connector already installed. The source
26+
connector writes change event documents corresponding to the ``Source``
27+
collection in the ``CDCTutorial`` database to a Kafka topic. The configuration
28+
for the source connector is as follows:
29+
<TODO: Link to Source - Fundamentals - Change Streams page>
30+
31+
.. code-block:: properties
32+
:copyable: false
33+
34+
name="mongo-source-CDCTutorial-eventroundtrip"
35+
connector.class="com.mongodb.kafka.connect.MongoSourceConnector"
36+
connection.uri="mongodb://mongo1:27017,mongo2:27017,mongo3:27017"
37+
database="CDCTutorial"
38+
collection="Source"
39+
40+
To download and start the pipeline, execute the following command from
41+
within the root directory of your cloned repository:
42+
43+
.. code-block:: bash
44+
45+
docker-compose -p cdc-tutorial up -d
46+
47+
.. include:: /includes/tutorials/download-note.rst
48+
49+
Once the preceding command finishes and the pipeline starts, you should see
50+
output that looks like this:
51+
52+
.. code-block:: text
53+
:copyable: false
54+
55+
...
56+
Creating mongo1 ... done
57+
Creating mongo1 ... done
58+
Creating zookeeper ... done
59+
Creating broker ... done
60+
Creating mongo1-setup ... done
61+
Creating connect ... done
62+
Creating shell ... done
63+
64+
Open a second terminal window. You will use one
65+
terminal to monitor your topic, and the other terminal to perform write
66+
operations on your database. Enter the following command into both terminals:
67+
68+
.. code-block:: bash
69+
70+
docker exec -it shell /bin/bash
71+
72+
Once you have entered the preceding command into both terminal windows, your
73+
terminals should look like the following image:
74+
75+
.. figure:: /includes/figures/two-shells-after.png
76+
:alt: Arrangement of two shells for this tutorial.
77+
78+
Arrange your two terminal windows to match the preceding image so that
79+
both are visible and one is above the other.
80+
81+
To monitor your topic, type the following command in your upper terminal window:
82+
83+
.. code-block:: bash
84+
85+
kafkacat -b broker:29092 -C -t CDCTutorial.Source
86+
87+
.. important:: Broker Leader Not Available
88+
89+
If you receive the following output, run the preceding ``kafkacat`` command
90+
a second time:
91+
92+
.. code-block:: text
93+
:copyable: false
94+
95+
% Error: Topic CDCTutorial.Source error: Broker: Leader not available
96+
97+
Once you enter the preceding command, you should see output that looks like
98+
this:
99+
100+
.. code-block:: text
101+
:copyable: false
102+
103+
% Reached end of topic CDCTutorial.Source [0] at offset 0
104+
105+
Your upper terminal window is now listening to the ``CDCTutorial.Source`` Kafka
106+
topic. Messages published to the topic are printed in this terminal window.
107+
108+
---
109+
title: Configure Sink Connector
110+
ref: cdc-tutorial-configure-sink
111+
content: |
112+
To configure your sink connector, execute the following command in your lower
113+
terminal window:
114+
115+
.. code-block:: bash
116+
117+
curl -X POST -H "Content-Type: application/json" --data '
118+
{ "name": "mongo-sink-CDCTutorial-eventroundtrip",
119+
"config": {
120+
"connector.class":"com.mongodb.kafka.connect.MongoSinkConnector",
121+
"tasks.max":"1",
122+
"topics":"CDCTutorial.Source",
123+
"change.data.capture.handler":"com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler",
124+
"connection.uri":"mongodb://mongo1:27017,mongo2:27017,mongo3:27017",
125+
"database":"CDCTutorial",
126+
"collection":"Destination"}
127+
}' http://connect:8083/connectors -w "\n" | jq .
128+
129+
After you run the preceding command, you should see the following output:
130+
131+
.. code-block:: json
132+
133+
...
134+
{
135+
"name": "mongo-sink-CDCTutorial-eventroundtrip",
136+
"config": {
137+
"connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
138+
"tasks.max": "1",
139+
"topics": "CDCTutorial.Source",
140+
"change.data.capture.handler": "com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler",
141+
"connection.uri": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017",
142+
"database": "CDCTutorial",
143+
"collection": "Destination",
144+
"name": "mongo-sink-CDCTutorial-eventroundtrip"
145+
},
146+
"tasks": [],
147+
"type": "sink"
148+
}
149+
150+
This configuration makes your sink connector do the following things:
151+
152+
- Listen for events on the ``CDCTutorial.Source`` topic
153+
- Apply a change data capture handler to documents it receives from the
154+
``CDCTutorial.Source`` topic
155+
- Write received documents to the ``Destination`` collection in the
156+
``CDCTutorial`` database
157+
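Conceptually, a change data capture handler turns each change event into the
matching write on the destination collection. The following is a minimal
sketch of that idea only, not the connector's implementation: the
``applyChangeEvent`` function and the in-memory ``destination`` map are
hypothetical names used for illustration, and only insert, replace, and delete
events are modeled.

```javascript
// Conceptual sketch only; this is not how the connector is implemented.
// The "collection" is an in-memory Map keyed by the serialized _id.
function applyChangeEvent(collection, event) {
  const id = JSON.stringify(event.documentKey._id);
  switch (event.operationType) {
    case "insert":
    case "replace":
      // Store the full document carried by the event.
      collection.set(id, event.fullDocument);
      break;
    case "delete":
      // Remove the document identified by documentKey.
      collection.delete(id);
      break;
    default:
      // Other event types (for example "update") are out of scope here.
      break;
  }
  return collection;
}

const destination = new Map();
applyChangeEvent(destination, {
  operationType: "insert",
  documentKey: { _id: 1 },
  fullDocument: { _id: 1, proclaim: "Hello World!" },
});
console.log(destination.size); // 1
applyChangeEvent(destination, {
  operationType: "delete",
  documentKey: { _id: 1 },
});
console.log(destination.size); // 0
```

Replaying events in order this way is why the ``Destination`` collection ends
up mirroring the ``Source`` collection in the steps that follow.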
158+
---
159+
title: Change Data in MongoDB
160+
ref: cdc-tutorial-change-data
161+
content: |
162+
From your lower terminal, enter the
163+
`MongoDB Shell <https://docs.mongodb.com/mongodb-shell/>`__
164+
with the following command:
165+
166+
.. code-block:: bash
167+
168+
mongosh "mongodb://mongo1:27017/?replicaSet=rs0"
169+
170+
Once you are in the MongoDB Shell, your terminal prompt should look like this:
171+
172+
.. code-block:: bash
173+
:copyable: false
174+
175+
rs0 [primary] test>
176+
177+
Insert a document into the ``Source`` collection of the ``CDCTutorial`` database
178+
with the following commands:
179+
180+
.. code-block:: javascript
181+
182+
use CDCTutorial
183+
db.Source.insertOne({proclaim: "Hello World!"});
184+
185+
Once you insert the document, you should see output that resembles the following
186+
in your upper shell:
187+
188+
.. code-block:: bash
189+
:copyable: false
190+
191+
{"schema":{"type":"string","optional":false},
192+
"payload":{"_id": {"_data": "8260...4"},
193+
"operationType": "insert",
194+
"clusterTime": {"$timestamp": {"t": 1611348141, "i": 2}},
195+
"fullDocument": {"_id": {"$oid": "600b38ad..."}, "proclaim": "Hello World!"},
196+
"ns": {"db": "CDCTutorial", "coll": "Source"},
197+
"documentKey": {"_id": {"$oid": "600b38a...."}}}}
198+
199+
In your lower shell, inspect the ``Destination`` collection with the following
200+
command:
201+
202+
.. code-block:: javascript
203+
204+
db.Destination.find()
205+
206+
You should see output that looks like this:
207+
208+
.. code-block:: bash
209+
:copyable: false
210+
211+
{ _id: ..., proclaim: 'Hello World!' }
212+
213+
Try deleting your document from your ``Source`` collection with the following
214+
command:
215+
216+
.. code-block:: javascript
217+
218+
db.Source.deleteMany({})
219+
220+
Once you delete the document, you should see output that resembles the following
221+
in your upper shell:
222+
223+
.. code-block:: bash
224+
:copyable: false
225+
226+
{"schema":{"type":"string","optional":false},"payload":"{\"_id\":
227+
{\"_data\": \"826138BCBA000000012B022C0100296E5A10041FD232D9ECE347FFABA837E9AB05D95046645F696400646138BCAF2A52D9E0D299336F0004\"},
228+
\"operationType\": \"delete\", \"clusterTime\": {\"$timestamp\": {\"t\":
229+
1631108282, \"i\": 1}}, \"ns\": {\"db\": \"CDCTutorial\", \"coll\":
230+
\"Source\"}, \"documentKey\": {\"_id\": {\"$oid\":
231+
\"6138bcaf2a52d9e0d299336f\"}}}"}
232+
233+
Now see how many documents are in your ``Destination`` collection:
234+
235+
.. code-block:: javascript
236+
237+
db.Destination.countDocuments()
238+
239+
You should see the following output:
240+
241+
.. code-block:: text
242+
:copyable: false
243+
244+
0
245+
246+
Once you have finished exploring the connector in the MongoDB shell, you can
247+
exit the MongoDB shell with the following command:
248+
249+
.. code-block:: bash
250+
251+
exit
252+
253+
Try exploring the CDC handler on your own. Here are some challenges to get
254+
you started:
255+
256+
- Add a second source connector that writes to the ``CDCTutorial.Source``
257+
topic. Use a pipeline to have this connector only write insert events.
258+
- Remove the ``change.data.capture.handler`` from your sink connector. What
259+
do your documents look like?
260+
- Use ``kafkacat`` to upload a message to the ``CDCTutorial.Source`` topic
261+
that isn't a MongoDB Change Event document. What happens?
262+
263+
<TODO: Link to relevant docs sections from the explore list above >
264+
265+
---
266+
title: Stop the Pipeline
267+
ref: cdc-tutorial-stop-pipeline
268+
content: |
269+
To conserve resources on your computer, make sure to stop the sample pipeline
270+
once you are done exploring this example.
271+
272+
Before you stop the sample pipeline, make sure to exit your Docker shell.
273+
You can exit your Docker shell by running the following command in your lower
274+
terminal:
275+
276+
.. code-block:: bash
277+
278+
exit
279+
280+
To stop the sample pipeline and remove containers and images, run the following command:
281+
282+
.. code-block:: bash
283+
284+
docker-compose -p cdc-tutorial down --rmi 'all'
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
1+
.. note:: How long does the download take?
2+
3+
In total, the Docker images for this tutorial require about 2.4 GB of space.
4+
The following list shows how long it takes to download the images with
5+
different internet speeds:
6+
7+
- 40 megabits per second: 8 minutes
8+
- 20 megabits per second: 16 minutes
9+
- 10 megabits per second: 32 minutes
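
These estimates come from dividing the total image size by the connection
speed. The following sketch reproduces the arithmetic. It assumes decimal
units (1 GB = 8000 megabits) and a fully used connection, and the
``downloadMinutes`` function name is illustrative.

```javascript
// Estimate download time in minutes for a given size and link speed.
// Assumes decimal units (1 GB = 8000 megabits) and a saturated link.
function downloadMinutes(sizeGB, megabitsPerSecond) {
  const megabits = sizeGB * 8000;
  const seconds = megabits / megabitsPerSecond;
  return seconds / 60;
}

for (const speed of [40, 20, 10]) {
  const minutes = Math.round(downloadMinutes(2.4, speed));
  console.log(`${speed} megabits per second: ${minutes} minutes`);
}
```

Real download times are usually longer than these estimates, because
connections rarely sustain their nominal speed.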
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
1+
We provide you with a sample data pipeline so that you can use the MongoDB
2+
Kafka Connector. To access and use the files that define your sample data pipeline,
3+
clone the tutorial repository and change your directory to the root of the
4+
repository using the following commands:
5+
6+
<TODO: decide on final location for this example>
7+
8+
.. code-block:: bash
9+
10+
git clone https://github.com/biniona-mongodb/MongoKafkaLite
11+
cd MongoKafkaLite
Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
This guide uses the following tools:
2+
3+
- `Docker Platform <https://docs.docker.com/get-docker/>`__
4+
- `Git <https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>`__
5+
6+
If you do not have one of these tools installed on your computer, you
7+
can install it by clicking on the tool's name and following the linked
8+
installation instructions.
9+
10+
.. tip:: Read the Docker Documentation
11+
12+
This guide uses some Docker-specific terminology. If you are new to Docker
13+
and would like a comprehensive introduction, read through Docker's official
14+
`Get Started Guide <https://docs.docker.com/get-started/>`__.

source/quick-start.txt

Lines changed: 3 additions & 40 deletions
@@ -27,41 +27,12 @@ introductory page on :doc:`Kafka and Kafka Connect <introduction/kafka-connect>`
2727
Requirements
2828
------------
2929

30-
This guide uses the following tools:
31-
32-
- `Docker Platform <https://docs.docker.com/get-docker/>`__ *required*
33-
- `Git <https://git-scm.com/book/en/v2/Getting-Started-Installing-Git>`__ *optional*
34-
35-
If you do not have any of these tools installed on your computer, you
36-
can install a tool by clicking on the tool's name and following the linked
37-
installation instructions.
38-
39-
.. tip:: Read the Docker Documentation
40-
41-
This guide uses some Docker specific terminology. If you are new to Docker
42-
and would like a comprehensive introduction, read through Docker's official
43-
`Get Started Guide <https://docs.docker.com/get-started/>`__.
30+
.. include:: /includes/tutorials/pipeline-requirements.rst
4431

4532
Sample Pipeline
4633
~~~~~~~~~~~~~~~
4734

48-
In this guide you receive a sample data pipeline so that you can use the MongoDB
49-
Kafka Connector. To access and use the files that define your sample data pipeline,
50-
clone the quick start repository and change your directory to the root of the
51-
repository using the following commands:
52-
53-
<TODO: decide on final location for this example>
54-
55-
.. code-block:: bash
56-
57-
git clone https://github.com/biniona-mongodb/MongoKafkaLite
58-
cd MongoKafkaLite
59-
60-
.. note:: Download as a ZIP
61-
62-
If you would rather download the repository that defines the pipeline as a
63-
ZIP file, you can do that by
64-
`clicking this link <https://github.com/biniona-mongodb/MongoKafkaLite/archive/refs/heads/main.zip>`__.
35+
.. include:: /includes/tutorials/get-pipeline.rst
6536

6637
Start the Pipeline
6738
------------------
@@ -90,15 +61,7 @@ the root of the quick start repository:
9061

9162
docker-compose -p quickstart up -d
9263

93-
.. note:: How long does the download take?
94-
95-
In total, the Docker images for the quick start require about 2.4 GB of space.
96-
The following list shows how long it takes to download the images with
97-
different internet speeds:
98-
99-
- 40 megabits per second: 8 minutes
100-
- 20 megabits per second: 16 minutes
101-
- 10 megabits per second: 32 minutes
64+
.. include:: /includes/tutorials/download-note.rst
10265

10366
Set Up Connectors
10467
-----------------

source/tutorials.txt

Lines changed: 1 addition & 0 deletions
@@ -11,5 +11,6 @@ Tutorials
1111
Use Built-In and Custom Write Strategies </tutorials/write-strategies>
1212
Use Built-In and Custom Post-processors </tutorials/post-processors>
1313
Handle Errors in the Sink and Source Connectors </tutorials/handle-errors>
14+
Replicate Data with the Change Data Capture Handler </tutorials/replicate-with-cdc>
1415

1516
asdf
