|
| 1 | +title: Download Source Files |
| 2 | +ref: cdc-tutorial-download-files |
| 3 | +content: | |
| 4 | + .. include:: /includes/tutorials/get-pipeline.rst |
| 5 | +
|
| 6 | + Once you have cloned the repository, switch to the |
| 7 | + ``cdc-tutorial`` branch with the following command: |
| 8 | +
|
| 9 | + .. code-block:: bash |
| 10 | +
|
| 11 | + git checkout cdc-tutorial |
| 12 | +
|
| 13 | +--- |
| 14 | +title: Set Up the Environment |
| 15 | +ref: cdc-tutorial-set-up-environment |
| 16 | +content: | |
| 17 | + The sample pipeline consists of the following tools running in Docker |
| 18 | + containers on your computer: |
| 19 | +
|
| 20 | + - A MongoDB replica set |
| 21 | + - An Apache Kafka instance |
| 22 | + - A Kafka Connect instance with the MongoDB Kafka Connector installed |
| 23 | + - A Zookeeper instance (Zookeeper is a dependency of Apache Kafka) |
| 24 | +
|
| 25 | + The pipeline comes with a source connector already installed. The source |
| 26 | + connector writes change event documents corresponding to the ``Source`` |
| 27 | + collection in the ``CDCTutorial`` database to a Kafka topic. The configuration |
| 28 | + for the source connector is as follows: |
| 29 | + <TODO: Link to Source - Fundamentals - Change Streams page> |
| 30 | + |
| 31 | + .. code-block:: properties |
| 32 | + :copyable: false |
| 33 | +
|
| 34 | + name="mongo-source-CDCTutorial-eventroundtrip" |
| 35 | + connector.class="com.mongodb.kafka.connect.MongoSourceConnector" |
| 36 | + connection.uri="mongodb://mongo1:27017,mongo2:27017,mongo3:27017" |
| 37 | + database="CDCTutorial" |
| 38 | + collection="Source" |
| 39 | +
|
| 40 | + To download and start the pipeline, execute the following command from |
| 41 | + within the root directory of your cloned repository: |
| 42 | +
|
| 43 | + .. code-block:: bash |
| 44 | +
|
| 45 | + docker-compose -p cdc-tutorial up -d |
| 46 | +
|
| 47 | + .. include:: /includes/tutorials/download-note.rst |
| 48 | +
|
| 49 | + Once the preceding command finishes and the pipeline starts, you should see |
| 50 | + output that looks like this: |
| 51 | +
|
| 52 | + .. code-block:: text |
| 53 | + :copyable: false |
| 54 | +
|
| 55 | + ... |
| 56 | + Creating mongo1 ... done |
| 57 | + Creating mongo1 ... done |
| 58 | + Creating zookeeper ... done |
| 59 | + Creating broker ... done |
| 60 | + Creating mongo1-setup ... done |
| 61 | + Creating connect ... done |
| 62 | + Creating shell ... done |
| 63 | +
|
| 64 | + Open a second terminal window. You will use one |
| 65 | + terminal to monitor your topic, and the other terminal to perform write |
| 66 | + operations on your database. Enter the following command into both terminals: |
| 67 | +
|
| 68 | + .. code-block:: bash |
| 69 | +
|
| 70 | + docker exec -it shell /bin/bash |
| 71 | +
|
| 72 | + Once you have entered the preceding command into both terminal windows, your |
| 73 | + terminals should look like this: |
| 74 | +
|
| 75 | + .. figure:: /includes/figures/two-shells-after.png |
| 76 | + :alt: Arrangement of two shells for this tutorial. |
| 77 | +
|
| 78 | + Arrange your two terminal windows to match the preceding image so that |
| 79 | + both are visible and one is above the other. |
| 80 | +
|
| 81 | + To monitor your topic, type the following command in your upper terminal window: |
| 82 | +
|
| 83 | + .. code-block:: bash |
| 84 | +
|
| 85 | + kafkacat -b broker:29092 -C -t CDCTutorial.Source |
| 86 | +
|
| 87 | + .. important:: Broker Leader Not Available |
| 88 | +
|
| 89 | + If you receive the following output, run the preceding ``kafkacat`` command |
| 90 | + a second time: |
| 91 | +
|
| 92 | + .. code-block:: text |
| 93 | + :copyable: false |
| 94 | +
|
| 95 | + % Error: Topic CDCTutorial.Source error: Broker: Leader not available |
| 96 | +
|
| 97 | + Once you enter the preceding command, you should see output that looks like |
| 98 | + this: |
| 99 | +
|
| 100 | + .. code-block:: text |
| 101 | + :copyable: false |
| 102 | +
|
| 103 | + % Reached end of topic CDCTutorial.Source [0] at offset 0 |
| 104 | +
|
| 105 | + Your upper terminal window is now listening to the ``CDCTutorial.Source`` Kafka |
| 106 | + topic. Changes to your topic will print in this terminal window. |
| 107 | +
|
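The topic name used in the ``kafkacat`` command follows from the source connector settings shown earlier. The following is a minimal sketch, assuming the connector's default naming of ``<database>.<collection>`` with an optional prefix. The helper name ``changeStreamTopic`` is hypothetical and is not part of the connector:

```javascript
// Hypothetical helper: models the assumed default topic naming,
// "<prefix>.<database>.<collection>", with the prefix left out when it is empty.
function changeStreamTopic(database, collection, prefix = "") {
  return [prefix, database, collection]
    .filter((part) => part !== "")
    .join(".");
}

// The tutorial's source connector watches CDCTutorial.Source and sets no prefix.
console.log(changeStreamTopic("CDCTutorial", "Source"));
```

If your connector configuration overrides topic naming, the topic you monitor will differ from the one this sketch prints.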
| 108 | +--- |
| 109 | +title: Configure Sink Connector |
| 110 | +ref: cdc-tutorial-configure-sink |
| 111 | +content: | |
| 112 | + To configure your sink connector, execute the following command in your lower |
| 113 | + terminal window: |
| 114 | +
|
| 115 | + .. code-block:: bash |
| 116 | +
|
| 117 | + curl -X POST -H "Content-Type: application/json" --data ' |
| 118 | + { "name": "mongo-sink-CDCTutorial-eventroundtrip", |
| 119 | + "config": { |
| 120 | + "connector.class":"com.mongodb.kafka.connect.MongoSinkConnector", |
| 121 | + "tasks.max":"1", |
| 122 | + "topics":"CDCTutorial.Source", |
| 123 | + "change.data.capture.handler":"com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler", |
| 124 | + "connection.uri":"mongodb://mongo1:27017,mongo2:27017,mongo3:27017", |
| 125 | + "database":"CDCTutorial", |
| 126 | + "collection":"Destination"} |
| 127 | + }' http://connect:8083/connectors -w "\n" | jq . |
| 128 | +
|
| 129 | + After you run the preceding command, you should see the following output: |
| 130 | +
|
| 131 | + .. code-block:: json |
| 132 | +
|
| 133 | + ... |
| 134 | + { |
| 135 | + "name": "mongo-sink-CDCTutorial-eventroundtrip", |
| 136 | + "config": { |
| 137 | + "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector", |
| 138 | + "tasks.max": "1", |
| 139 | + "topics": "CDCTutorial.Source", |
| 140 | + "change.data.capture.handler": "com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler", |
| 141 | + "connection.uri": "mongodb://mongo1:27017,mongo2:27017,mongo3:27017", |
| 142 | + "database": "CDCTutorial", |
| 143 | + "collection": "Destination", |
| 144 | + "name": "mongo-sink-CDCTutorial-eventroundtrip" |
| 145 | + }, |
| 146 | + "tasks": [], |
| 147 | + "type": "sink" |
| 148 | + } |
| 149 | +
|
| 150 | + This configuration makes your sink connector do the following things: |
| 151 | +
|
| 152 | + - Listen for events on the ``CDCTutorial.Source`` topic |
| 153 | + - Apply a change data capture handler to documents it receives from the |
| 154 | + ``CDCTutorial.Source`` topic |
| 155 | + - Write received documents to the ``Destination`` collection in the |
| 156 | + ``CDCTutorial`` database |
| 157 | +
|
| 158 | +--- |
| 159 | +title: Change Data in MongoDB |
| 160 | +ref: cdc-tutorial-change-data |
| 161 | +content: | |
| 162 | + From your lower terminal, enter the |
| 163 | + `MongoDB Shell <https://docs.mongodb.com/mongodb-shell/>`__ |
| 164 | + with the following command: |
| 165 | +
|
| 166 | + .. code-block:: bash |
| 167 | +
|
| 168 | + mongosh mongodb://mongo1:27017/?replicaSet=rs0 |
| 169 | +
|
| 170 | + Once you are in the MongoDB Shell, your terminal prompt should look like this: |
| 171 | +
|
| 172 | + .. code-block:: bash |
| 173 | + :copyable: false |
| 174 | +
|
| 175 | + rs0 [primary] test> |
| 176 | +
|
| 177 | + Insert a document into the ``Source`` collection of the ``CDCTutorial`` database |
| 178 | + with the following commands: |
| 179 | +
|
| 180 | + .. code-block:: javascript |
| 181 | +
|
| 182 | + use CDCTutorial |
| 183 | + db.Source.insertOne({proclaim: "Hello World!"}); |
| 184 | +
|
| 185 | + Once you insert the document, you should see output that resembles the following |
| 186 | + in your upper shell: |
| 187 | + |
| 188 | + .. code-block:: bash |
| 189 | + :copyable: false |
| 190 | +
|
| 191 | + {"schema":{"type":"string","optional":false}, |
| 192 | + "payload":{"_id": {"_data": "8260...4"}, |
| 193 | + "operationType": "insert", |
| 194 | + "clusterTime": {"$timestamp": {"t": 1611348141, "i": 2}}, |
| 195 | + "fullDocument": {"_id": {"$oid": "600b38ad..."}, "proclaim": "Hello World!"}, |
| 196 | + "ns": {"db": "CDCTutorial", "coll": "Source"}, |
| 197 | + "documentKey": {"_id": {"$oid": "600b38a...."}}}} |
| 198 | +
|
| 199 | + In your lower shell, inspect the ``Destination`` collection with the following |
| 200 | + command: |
| 201 | +
|
| 202 | + .. code-block:: javascript |
| 203 | +
|
| 204 | + db.Destination.find() |
| 205 | +
|
| 206 | + You should see output that looks like this: |
| 207 | +
|
| 208 | + .. code-block:: bash |
| 209 | + :copyable: false |
| 210 | +
|
| 211 | + { _id: ..., proclaim: 'Hello World!' } |
| 212 | +
|
| 213 | + Try deleting your document from your ``Source`` collection with the following |
| 214 | + command: |
| 215 | + |
| 216 | + .. code-block:: javascript |
| 217 | +
|
| 218 | + db.Source.deleteMany({}) |
| 219 | +
|
| 220 | + Once you delete the document, you should see output that resembles the following |
| 221 | + in your upper shell: |
| 222 | +
|
| 223 | + .. code-block:: bash |
| 224 | + :copyable: false |
| 225 | +
|
| 226 | + {"schema":{"type":"string","optional":false},"payload":"{\"_id\": |
| 227 | + {\"_data\": \"826138BCBA000000012B022C0100296E5A10041FD232D9ECE347FFABA837E9AB05D95046645F696400646138BCAF2A52D9E0D299336F0004\"}, |
| 228 | + \"operationType\": \"delete\", \"clusterTime\": {\"$timestamp\": {\"t\": |
| 229 | + 1631108282, \"i\": 1}}, \"ns\": {\"db\": \"CDCTutorial\", \"coll\": |
| 230 | + \"Source\"}, \"documentKey\": {\"_id\": {\"$oid\": |
| 231 | + \"6138bcaf2a52d9e0d299336f\"}}}"} |
| 232 | +
|
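In the preceding delete output, the ``payload`` field is a string that itself contains JSON, so you must parse a message twice to reach the change event fields. The following is a minimal sketch of that decoding, runnable with Node.js. The helper name ``decodeChangeEvent`` is hypothetical, and the resume token is a placeholder rather than a real value:

```javascript
// Rebuild a message shaped like the delete event above.
// The "_data" resume token is a placeholder, not a real value.
const message = JSON.stringify({
  schema: { type: "string", optional: false },
  payload: JSON.stringify({
    _id: { _data: "<resume token>" },
    operationType: "delete",
    clusterTime: { $timestamp: { t: 1631108282, i: 1 } },
    ns: { db: "CDCTutorial", coll: "Source" },
    documentKey: { _id: { $oid: "6138bcaf2a52d9e0d299336f" } },
  }),
});

// Hypothetical helper: unwrap the envelope, then parse the payload
// if it arrived as a JSON string.
function decodeChangeEvent(rawMessage) {
  const envelope = JSON.parse(rawMessage);
  if (typeof envelope.payload === "string") {
    return JSON.parse(envelope.payload);
  }
  return envelope.payload;
}

const event = decodeChangeEvent(message);
console.log(event.operationType, event.ns.coll, event.documentKey._id.$oid);
```

Notice that a delete event carries a ``documentKey`` but no ``fullDocument``, which is why the sink can only identify the deleted document by its ``_id``.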
| 233 | + Now see how many documents are in your ``Destination`` collection: |
| 234 | +
|
| 235 | + .. code-block:: javascript |
| 236 | +
|
| 237 | + db.Destination.countDocuments() |
| 238 | +
|
| 239 | + You should see the following output: |
| 240 | +
|
| 241 | + .. code-block:: text |
| 242 | + :copyable: false |
| 243 | +
|
| 244 | + 0 |
| 245 | +
|
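The count drops to zero because the CDC handler replays each change event against the ``Destination`` collection instead of storing the event itself. The following is a conceptual sketch of that behavior, not the connector's actual implementation. A ``Map`` stands in for the collection, and the function name ``applyChangeEvent`` is hypothetical:

```javascript
// Conceptual model only: a Map keyed by document _id stands in for the
// Destination collection. This is not how the connector is implemented.
function applyChangeEvent(destination, event) {
  const key = JSON.stringify(event.documentKey._id);
  switch (event.operationType) {
    case "insert":
    case "replace":
      // Store the full document under its key.
      destination.set(key, event.fullDocument);
      break;
    case "delete":
      // A delete event only carries the key, so remove by key.
      destination.delete(key);
      break;
    default:
      // Other operation types are left out of this sketch.
      break;
  }
  return destination;
}

const destination = new Map();
const id = { $oid: "6138bcaf2a52d9e0d299336f" };

applyChangeEvent(destination, {
  operationType: "insert",
  documentKey: { _id: id },
  fullDocument: { _id: id, proclaim: "Hello World!" },
});
console.log(destination.size); // one document after the insert

applyChangeEvent(destination, {
  operationType: "delete",
  documentKey: { _id: id },
});
console.log(destination.size); // empty again after the delete
```

Without a CDC handler, a sink would instead write each event as its own document, which is what the second challenge below lets you observe.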
| 246 | + Once you have finished exploring the connector in the MongoDB shell, you can |
| 247 | + exit the MongoDB shell with the following command: |
| 248 | +
|
| 249 | + .. code-block:: bash |
| 250 | +
|
| 251 | + exit |
| 252 | +
|
| 253 | + Try exploring the CDC handler on your own. Here are some challenges to get |
| 254 | + you started: |
| 255 | +
|
| 256 | + - Add a second source connector that writes to the ``CDCTutorial.Source`` |
| 257 | + topic. Use a pipeline to have this connector only write insert events. |
| 258 | + - Remove the ``change.data.capture.handler`` from your sink connector. What |
| 259 | + do your documents look like? |
| 260 | + - Use ``kafkacat`` to upload a message to the ``CDCTutorial.Source`` topic |
| 261 | + that isn't a MongoDB Change Event document. What happens? |
| 262 | +
|
| 263 | + <TODO: Link to relevant docs sections from the explore list above > |
| 264 | +
|
| 265 | +--- |
| 266 | +title: Stop the Pipeline |
| 267 | +ref: cdc-tutorial-stop-pipeline |
| 268 | +content: | |
| 269 | + To conserve resources on your computer, make sure to stop the sample pipeline |
| 270 | + once you are done exploring this example. |
| 271 | +
|
| 272 | + Before you stop the sample pipeline, make sure to exit your Docker shell. |
| 273 | + You can exit your Docker shell by running the following command in your lower |
| 274 | + terminal: |
| 275 | +
|
| 276 | + .. code-block:: bash |
| 277 | +
|
| 278 | + exit |
| 279 | +
|
| 280 | + To stop the sample pipeline and remove containers and images, run the following command: |
| 281 | +
|
| 282 | + .. code-block:: bash |
| 283 | +
|
| 284 | + docker-compose -p cdc-tutorial down --rmi 'all' |