Commit 902e4ff

biniona-mongodb authored and schmalliso committed
(DOCSP-15804) CDC-Handler (#158)
1 parent 41413fc commit 902e4ff

6 files changed

+254
-8
lines changed

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
1+
.. _cdc-debezium-example:
2+
3+
Your sink connector can replicate Debezium CDC events originating from these datastores:
4+
5+
- MongoDB
6+
- Postgres
7+
- MySQL
8+
9+
Click the following tabs to see how to configure the Debezium CDC handler to replicate
10+
CDC events from each of the preceding datastores:
11+
12+
.. tabs::
13+
14+
.. tab::
15+
:tabid: MongoDB
16+
17+
The following properties file configures a sink connector to replicate
18+
Debezium CDC events corresponding to changes in a MongoDB instance:
19+
20+
.. code-block:: properties
21+
:emphasize-lines: 6
22+
23+
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
24+
connection.uri=<your connection uri>
25+
database=<your mongodb database>
26+
collection=<your mongodb collection>
27+
topics=<topic containing debezium cdc events>
28+
change.data.capture.handler=com.mongodb.kafka.connect.sink.cdc.debezium.mongodb.MongoDbHandler
29+
30+
To view the source code for the Debezium CDC handler, see
31+
:github:`the {+mkc+} source code <mongodb/mongo-kafka/tree/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/debezium>`.
32+
33+
.. tab::
34+
:tabid: Postgres
35+
36+
The following properties file configures a sink connector to replicate
37+
Debezium CDC events corresponding to changes in a Postgres instance:
38+
39+
.. code-block:: properties
40+
:emphasize-lines: 6
41+
42+
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
43+
connection.uri=<your connection uri>
44+
database=<your mongodb database>
45+
collection=<your mongodb collection>
46+
topics=<topic containing debezium cdc events>
47+
change.data.capture.handler=com.mongodb.kafka.connect.sink.cdc.debezium.rdbms.postgres.PostgresHandler
48+
49+
To view the source code for the Debezium CDC handler, see
50+
:github:`the {+mkc+} source code <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/debezium/rdbms/postgres/PostgresHandler.java>`.
51+
52+
.. tab::
53+
:tabid: MySQL
54+
55+
The following properties file configures a sink connector to replicate
56+
Debezium CDC events corresponding to changes in a MySQL instance:
57+
58+
.. code-block:: properties
59+
:emphasize-lines: 6
60+
61+
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
62+
connection.uri=<your connection uri>
63+
database=<your mongodb database>
64+
collection=<your mongodb collection>
65+
topics=<topic containing debezium cdc events>
66+
change.data.capture.handler=com.mongodb.kafka.connect.sink.cdc.debezium.rdbms.mysql.MysqlHandler
67+
68+
To view the source code for the Debezium CDC handler, see
69+
:github:`the {+mkc+} source code <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/debezium/rdbms/mysql/MysqlHandler.java>`.
70+
71+
.. note:: Customize the Debezium CDC Handler
72+
73+
If the Debezium CDC handler is unable to replicate CDC events
74+
from your datastore, you can customize the handler by extending the
75+
:github:`DebeziumCdcHandler <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/debezium/DebeziumCdcHandler.java>`
76+
class. For more information on custom CDC handlers, see the
77+
:ref:`Create Your Own CDC Handler section <cdc-create-your-own>` of this guide.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
1+
.. _cdc-mongodb-example:
2+
3+
The following properties file configures a sink connector to replicate
4+
MongoDB change event documents:
5+
6+
.. code-block:: properties
7+
:emphasize-lines: 6
8+
9+
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
10+
connection.uri=<your connection uri>
11+
database=<your database>
12+
collection=<your collection>
13+
topics=<topic containing mongodb change event documents>
14+
change.data.capture.handler=com.mongodb.kafka.connect.sink.cdc.mongodb.ChangeStreamHandler
15+
16+
To view the source code for the MongoDB CDC handler, see
17+
:github:`the {+mkc+} source code <mongodb/mongo-kafka/tree/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/mongodb>`.
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
1+
.. _cdc-qlik-replicate-example:
2+
3+
Your sink connector can replicate Qlik Replicate CDC events originating from these
4+
datastores:
5+
6+
- OracleDB
7+
- MySQL
8+
- Postgres
9+
10+
The following properties file configures a sink connector to replicate
11+
Qlik Replicate CDC events:
12+
13+
.. code-block:: properties
14+
:emphasize-lines: 6
15+
16+
connector.class=com.mongodb.kafka.connect.MongoSinkConnector
17+
connection.uri=<your connection uri>
18+
database=<your database>
19+
collection=<your collection>
20+
topics=<topic containing qlik replicate cdc events>
21+
change.data.capture.handler=com.mongodb.kafka.connect.sink.cdc.qlik.rdbms.RdbmsHandler
22+
23+
To view the source code for the Qlik Replicate CDC handler, see
24+
:github:`the {+mkc+} source code <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/qlik/rdbms/RdbmsHandler.java>`.
25+
26+
.. note:: Customize the Qlik Replicate CDC Handler
27+
28+
If the Qlik Replicate CDC handler is unable to replicate CDC events
29+
from your datastore, you can customize the handler by extending the
30+
:github:`QlikCdcHandler <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/qlik/QlikCdcHandler.java>`
31+
class. For more information on custom CDC handlers, see the
32+
:ref:`Create Your Own CDC Handler section <cdc-create-your-own>` of this guide.

source/sink-connector/fundamentals.txt

Lines changed: 1 addition & 1 deletion
@@ -10,6 +10,6 @@ Fundamentals
1010
Post-processors </sink-connector/fundamentals/post-processors>
1111
Data Transformations </sink-connector/fundamentals/data-transformations>
1212
Error Handling Strategies </sink-connector/fundamentals/error-handling-strategies>
13-
Change Data Capture </sink-connector/fundamentals/change-data-capture>
13+
Change Data Capture Handlers </sink-connector/fundamentals/change-data-capture>
1414

1515
asdf

source/sink-connector/fundamentals/change-data-capture.txt

Lines changed: 124 additions & 4 deletions
@@ -1,5 +1,125 @@
1-
==================================
2-
Sink Connector Change Data Capture
3-
==================================
1+
============================
2+
Change Data Capture Handlers
3+
============================
4+
5+
.. default-domain:: mongodb
6+
7+
.. contents:: On this page
8+
:local:
9+
:backlinks: none
10+
:depth: 2
11+
:class: singlecol
12+
13+
Overview
14+
--------
15+
16+
Learn how to **replicate** your **change data capture (CDC)** events with a {+mkc+} sink
17+
connector. CDC is a software architecture that converts changes in a datastore
18+
into a stream of **CDC events**. A CDC event is a message containing a
19+
reproducible representation of a change performed on a datastore. Replicating
20+
data is the process of applying the changes contained in CDC events from one
21+
datastore onto a different datastore so that the changes occur in both datastores.
22+
23+
Use a **CDC handler** to replicate CDC events stored on an {+ak+} topic into MongoDB.
24+
A CDC handler is a program that translates CDC events from a specific
25+
**CDC event producer** into MongoDB write operations.
26+
27+
A CDC event producer is an application that generates CDC events. CDC event
28+
producers can be datastores, or applications that watch datastores and generate
29+
CDC events corresponding to changes in the datastores.
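The idea of replication described above can be illustrated with a minimal, self-contained sketch. It does not use the connector: the ``CdcEvent`` class, its ``op``/``key``/``value`` fields, and the in-memory map standing in for the target datastore are all hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical CDC event: the operation, the key of the changed record, and its new value. */
final class CdcEvent {
    final String op;
    final String key;
    final String value;

    CdcEvent(String op, String key, String value) {
        this.op = op;
        this.key = key;
        this.value = value;
    }
}

final class ReplicationSketch {
    /** Applies each event, in order, to the target datastore (a plain map in this sketch). */
    static Map<String, String> replicate(List<CdcEvent> events, Map<String, String> target) {
        for (CdcEvent event : events) {
            switch (event.op) {
                case "insert":
                case "update":
                    target.put(event.key, event.value);
                    break;
                case "delete":
                    target.remove(event.key);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown operation: " + event.op);
            }
        }
        return target;
    }

    public static void main(String[] args) {
        List<CdcEvent> events = Arrays.asList(
            new CdcEvent("insert", "a", "1"),
            new CdcEvent("insert", "b", "2"),
            new CdcEvent("update", "a", "3"),
            new CdcEvent("delete", "b", null));
        // Prints {a=3}: the target ends in the same state as the source.
        System.out.println(replicate(events, new LinkedHashMap<>()));
    }
}
```

Because each event is a reproducible description of one change, replaying the events in order is enough to bring the target to the same state as the source.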
30+
31+
.. note::
32+
33+
MongoDB change streams are an example of a CDC architecture. To learn more about
34+
change streams, see
35+
:doc:`the {+mkc+} guide on Change Streams </source-connector/fundamentals/change-streams>`.
36+
37+
If you would like to view a tutorial demonstrating how to replicate data, see the
38+
:doc:`Replicate Data With a Change Data Capture Handler tutorial </tutorials/replicate-with-cdc>`.
39+
40+
Specify a CDC Handler
41+
---------------------
42+
43+
You can specify a CDC handler on your sink connector with the following configuration option:
44+
45+
.. code-block:: properties
46+
47+
change.data.capture.handler=<cdc handler class>
48+
49+
To learn more, see
50+
:doc:`change data capture configuration options </sink-connector/configuration-properties/cdc>`
51+
in the {+mkc+}.
52+
53+
Available CDC Handlers
54+
~~~~~~~~~~~~~~~~~~~~~~
55+
56+
The {+mkc+} provides CDC handlers for the following CDC event producers:
57+
58+
- MongoDB
59+
- `Debezium <https://debezium.io/>`__
60+
- `Qlik Replicate <https://www.qlik.com/us/products/qlik-replicate>`__
61+
62+
Click the following tabs to learn how to configure
63+
CDC handlers for the preceding event producers:
64+
65+
.. tabs::
66+
67+
.. tab::
68+
:tabid: MongoDB
69+
70+
.. include:: /includes/fundamentals/cdc/mongodb.rst
71+
72+
.. tab::
73+
:tabid: Debezium
74+
75+
.. include:: /includes/fundamentals/cdc/debezium.rst
76+
77+
.. tab::
78+
:tabid: Qlik Replicate
79+
80+
.. include:: /includes/fundamentals/cdc/qlik.rst
81+
82+
.. _cdc-create-your-own:
83+
84+
Create Your Own CDC Handler
85+
---------------------------
86+
87+
If none of the prebuilt CDC handlers fit your use case, you can create your own.
88+
Your custom CDC handler is a Java class that implements the ``CdcHandler`` interface.
89+
90+
To learn more, see the
91+
:github:`source code for the CdcHandler interface <mongodb/mongo-kafka/blob/master/src/main/java/com/mongodb/kafka/connect/sink/cdc/CdcHandler.java>`.
92+
93+
To view examples of CDC handler implementations, see
94+
:github:`the source code for the prebuilt CDC handlers <mongodb/mongo-kafka/tree/master/src/main/java/com/mongodb/kafka/connect/sink/cdc>`.
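As a rough sketch of the shape such a handler can take, the following example translates events from an invented producer into write operations. Everything in it is an assumption made for illustration: ``SketchCdcHandler``, ``WriteOp``, ``MyProducerHandler``, and the ``op``/``id`` event fields are hypothetical stand-ins, not the connector's real types or signatures. Consult the linked ``CdcHandler`` source for the actual contract.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a MongoDB write operation; not a driver type. */
final class WriteOp {
    final String kind;
    final Object id;

    WriteOp(String kind, Object id) {
        this.kind = kind;
        this.id = id;
    }

    @Override
    public String toString() {
        return kind + "(" + id + ")";
    }
}

/** Hypothetical stand-in for the handler contract; the real one lives in the connector. */
interface SketchCdcHandler {
    /** Returns the write to perform, or empty if the event should be skipped. */
    Optional<WriteOp> handle(Map<String, Object> event);
}

/** Handler for an invented event format with "op" (c, u, or d) and "id" fields. */
final class MyProducerHandler implements SketchCdcHandler {
    @Override
    public Optional<WriteOp> handle(Map<String, Object> event) {
        Object op = event.get("op");
        Object id = event.get("id");
        if (op == null || id == null) {
            return Optional.empty(); // Malformed event: nothing to write.
        }
        switch (op.toString()) {
            case "c":
                return Optional.of(new WriteOp("insert", id));
            case "u":
                return Optional.of(new WriteOp("replace", id));
            case "d":
                return Optional.of(new WriteOp("delete", id));
            default:
                return Optional.empty(); // Unsupported operation type.
        }
    }

    public static void main(String[] args) {
        Map<String, Object> event = new HashMap<>();
        event.put("op", "u");
        event.put("id", 42);
        System.out.println(new MyProducerHandler().handle(event));
    }
}
```

Returning an empty ``Optional`` for events the handler does not understand lets the caller skip them instead of failing the whole batch; whether that is the right policy depends on your use case.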
95+
96+
How to Use Your CDC Handler
97+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
98+
99+
To configure your sink connector to use your custom CDC handler, you must perform the
100+
following actions:
101+
102+
#. Compile your custom CDC handler class to a JAR file.
103+
104+
#. Add the compiled JAR to the classpath/plugin path for your Kafka workers.
105+
For more information about plugin paths, see the `Confluent documentation
106+
<https://docs.confluent.io/current/connect/managing/community.html>`__.
107+
108+
.. note::
109+
110+
Kafka Connect loads plugins in isolation. When you deploy a custom CDC
111+
handler, both the connector JAR and the CDC handler
112+
JAR should be on the same path. Your paths should resemble the following:
113+
114+
| ``<plugin.path>/mongo-kafka-connect/mongo-kafka-connect-all.jar``
115+
| ``<plugin.path>/mongo-kafka-connect/custom-CDC-handler.jar``
116+
117+
To learn more about {+kc+} plugins, see
118+
`this guide from Confluent <https://docs.confluent.io/home/connect/userguide.html#installing-kconnect-plugins>`__.
119+
120+
#. Specify your custom class in the ``change.data.capture.handler``
121+
:ref:`configuration setting <kafka-sink-properties>`.
122+
123+
To learn how to compile a class to a JAR file,
124+
`see this guide from Oracle <https://docs.oracle.com/javase/tutorial/deployment/jar/build.html>`__.
4125

5-
TODO:

source/tutorials/replicate-with-cdc.txt

Lines changed: 3 additions & 3 deletions
@@ -1,6 +1,6 @@
1-
===================================================
2-
Replicate Data With the Change Data Capture Handler
3-
===================================================
1+
=================================================
2+
Replicate Data with a Change Data Capture Handler
3+
=================================================
44

55
.. default-domain:: mongodb
66
