Commit 2b0aa01

biniona-mongodb authored and corryroot committed
(DOCSP-18212) Avro Schema (#133)
Co-authored-by: corryroot <[email protected]>
1 parent 19966e9 commit 2b0aa01

4 files changed

+154
-25
lines changed

source/includes/avro-customers.avro

Lines changed: 21 additions & 24 deletions
@@ -1,27 +1,24 @@
 {
-"type":"record",
-"name":"Customer",
-"fields":[
-{
-"name":"name",
-"type":"string"
-},
-{
-"name":"visits",
-"type":{
-"type":"array",
-"items":{
-"type":"long",
-"logicalType":"timestamp-millis"
-}
-}
-},
-{
-"name":"total_purchased",
-"type":{
-"type":"map",
-"values":"int"
-}
+"type": "record",
+"name": "Customer",
+"fields": [{
+"name": "name",
+"type": "string"
+},{
+"name": "visits",
+"type": {
+"type": "array",
+"items": {
+"type": "long",
+"logicalType": "timestamp-millis"
+}
 }
-]
+},{
+"name": "total_purchased",
+"type": {
+"type": "map",
+"values": "int"
+}
+}
+]
 }
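
Because this hunk touches nearly every line of the file, it can help to confirm that the change only reformats the schema. The following is a minimal sketch in Python that uses only the standard `json` module: it parses the old and new file contents shown above and compares the resulting objects. The names `OLD_SCHEMA`, `NEW_SCHEMA`, and `same_schema` are illustrative and not part of the commit, and the old text is condensed onto fewer lines to keep the example short.

```python
import json

# Old file contents (the removed lines), condensed onto fewer lines.
OLD_SCHEMA = """
{"type":"record","name":"Customer","fields":[
  {"name":"name","type":"string"},
  {"name":"visits","type":{"type":"array",
    "items":{"type":"long","logicalType":"timestamp-millis"}}},
  {"name":"total_purchased","type":{"type":"map","values":"int"}}
]}
"""

# New file contents (the added and context lines), as shown in the hunk.
NEW_SCHEMA = """
{
"type": "record",
"name": "Customer",
"fields": [{
"name": "name",
"type": "string"
},{
"name": "visits",
"type": {
"type": "array",
"items": {
"type": "long",
"logicalType": "timestamp-millis"
}
}
},{
"name": "total_purchased",
"type": {
"type": "map",
"values": "int"
}
}
]
}
"""


def same_schema(old_text: str, new_text: str) -> bool:
    """Return True when both texts parse to the same JSON value."""
    return json.loads(old_text) == json.loads(new_text)


if __name__ == "__main__":
    print(same_schema(OLD_SCHEMA, NEW_SCHEMA))
```

Comparing parsed values rather than raw text ignores whitespace and line breaks, which is the only kind of difference this hunk appears to introduce.
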

source/includes/properties-files/avro-schema/avro-source.properties

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+connector.class="com.mongodb.kafka.connect.MongoSourceConnector",
+connection.uri=<your mongodb uri>
+database=<your database name>
+collection="customers"
+publish.full.document.only=true
+output.format.value="schema"
+output.schema.value="{\"type\": \"record\", \"name\": \"Customer\", \"fields\": [{\"name\": \"name\", \"type\": \"string\"}, {\"name\": \"visits\", \"type\": {\"type\": \"array\", \"items\": {\"type\": \"long\", \"logicalType\": \"timestamp-millis\"}}}, {\"name\": \"total_purchased\", \"type\": {\"type\": \"map\", \"values\": \"int\"}}]}"
+key.converter="org.apache.kafka.connect.storage.StringConverter"
+value.converter="io.confluent.connect.avro.AvroConverter"
+value.converter.schema.registry.url=<your schema registry uri>
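
The `output.schema.value` line is the customers schema collapsed onto one line, with every inner double quote escaped. Writing that by hand is error-prone, so here is a minimal sketch of one way to generate it in Python with the standard `json` module. It serializes the schema once to get compact JSON text, then serializes that text again to get a quoted, escaped string. The helper name `to_property_value` is hypothetical, and the sketch assumes the quoted form used in the properties file above is the form you want.

```python
import json

# The customers schema, as a Python dict in the same key order as the file.
schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "name", "type": "string"},
        {
            "name": "visits",
            "type": {
                "type": "array",
                "items": {"type": "long", "logicalType": "timestamp-millis"},
            },
        },
        {"name": "total_purchased", "type": {"type": "map", "values": "int"}},
    ],
}


def to_property_value(schema_obj: dict) -> str:
    """Return the schema as a double-quoted string with inner quotes escaped."""
    single_line = json.dumps(schema_obj)   # schema as one line of JSON text
    return json.dumps(single_line)         # wrap in quotes and escape inner quotes


if __name__ == "__main__":
    print("output.schema.value=" + to_property_value(schema))
```

Decoding the value twice with `json.loads` recovers the original schema, which gives a quick check that nothing was lost in escaping.
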

source/introduction/data-formats.txt

Lines changed: 1 addition & 1 deletion
@@ -6,4 +6,4 @@ Data Formats
    :titlesonly:
    :maxdepth: 1
 
-
+   Avro Schema </introduction/data-formats/avro-schema>

source/introduction/data-formats/avro-schema.txt

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+===========
+Avro Schema
+===========
+
+.. default-domain:: mongodb
+
+.. contents:: On this page
+   :local:
+   :backlinks: none
+   :depth: 2
+   :class: singlecol
+
+Overview
+--------
+
+In this guide, you can learn about Avro schema and how to use it in the
+{+mkc+}.
+
+Avro Schema
+-----------
+
+<TODO: Group this with Raw JSON and JSON Data formats pages>
+
+Avro schema is a JSON-based schema definition syntax. Avro schema supports the
+specification of the following groups of data types:
+
+- `Primitive Types <https://avro.apache.org/docs/current/spec.html#schema_primitive>`__
+- `Complex Types <https://avro.apache.org/docs/current/spec.html#schema_complex>`__
+- `Logical Types <https://avro.apache.org/docs/current/spec.html#Logical+Types>`__
+
+.. important:: Sink Connectors and Logical Types
+
+   {+mkc+} sink connectors support all Avro schema primitive and complex types. However, sink
+   connectors support only the following logical types:
+
+   - ``decimal``
+   - ``date``
+   - ``time-millis``
+   - ``time-micros``
+   - ``timestamp-millis``
+   - ``timestamp-micros``
+
+For more information on Apache Avro, the open-source project that specifies Avro
+schema, see the
+`Apache Avro Documentation <https://avro.apache.org/docs/current/index.html>`__.
+
+For a list of all Avro schema types, see the
+`Apache Avro specification <https://avro.apache.org/docs/current/spec.html>`__.
+
+Construct and Apply a Schema
+----------------------------
+
+<TODO: Move this Content to somewhere in the Source Connector section>
+
+In this section, you can learn how to perform the following actions with Avro
+schema and the {+mkc+}:
+
+- :ref:`Construct a schema for a MongoDB collection <avro-schema-construct-schema>`
+- :ref:`Apply the schema in a source connector <avro-schema-apply-schema>`
+
+The example in this section references the fictional ``customers`` collection
+that contains documents with the following structure:
+
+.. literalinclude:: /includes/avro-customers.json
+   :language: json
+
+.. _avro-schema-construct-schema:
+
+Construct a Schema for your Collection
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following table shows how to convert the BSON types of documents in the ``customers``
+collection into Avro schema types:
+
+.. list-table::
+   :header-rows: 1
+   :widths: 33 33 34
+
+   * - Field Name
+     - BSON Type
+     - Avro Schema Type
+
+   * - ``name``
+     - ``String`` type
+     - `string <https://avro.apache.org/docs/current/spec.html#schema_primitive>`__ primitive type
+
+   * - ``visits``
+     - ``Array`` type holding ``Date`` type values
+     - `array <https://avro.apache.org/docs/current/spec.html#Arrays>`__
+       complex type holding
+       `timestamp-millis
+       <https://avro.apache.org/docs/current/spec.html#Timestamp+%28millisecond+precision%29>`__
+       logical type values
+
+   * - ``total_purchased``
+     - ``Object`` type holding ``Int32`` type values
+     - `map <https://avro.apache.org/docs/current/spec.html#Maps>`__ complex
+       type with `int <https://avro.apache.org/docs/current/spec.html#schema_primitive>`__
+       primitive type values.
+
+The complete schema for the ``customers`` collection looks like this:
+
+.. literalinclude:: /includes/avro-customers.avro
+   :language: json
+
+.. _avro-schema-apply-schema:
+
+Apply the Schema in your Source Connector
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following properties file configures your {+mkc+} source connector to apply
+your schema to incoming documents:
+
+.. literalinclude:: /includes/properties-files/avro-schema/avro-source.properties
+   :language: java
+
+For more information on applying schemas in the {+mkc+}, see our guide on
+applying schemas.
+<TODO: Link to Schema Guide>
+
+For more information on the Avro converter, see our guide on converters.
+<TODO: Link to Avro Converter>
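
The new page states that sink connectors support only six Avro logical types. As a rough illustration of how a schema could be checked against that list before use with a sink connector, the following Python sketch walks a parsed schema and reports any `logicalType` outside the supported set. It is a simple recursive scan written for this example, not the connector's own validation, and the names `SINK_SUPPORTED_LOGICAL_TYPES`, `find_logical_types`, and `unsupported_logical_types` are hypothetical.

```python
import json

# Logical types the page lists as supported by sink connectors.
SINK_SUPPORTED_LOGICAL_TYPES = {
    "decimal",
    "date",
    "time-millis",
    "time-micros",
    "timestamp-millis",
    "timestamp-micros",
}


def find_logical_types(node) -> set:
    """Recursively collect every "logicalType" value in a parsed schema."""
    found = set()
    if isinstance(node, dict):
        if "logicalType" in node:
            found.add(node["logicalType"])
        for value in node.values():
            found |= find_logical_types(value)
    elif isinstance(node, list):
        for item in node:
            found |= find_logical_types(item)
    return found


def unsupported_logical_types(schema_text: str) -> set:
    """Return the logical types in the schema that are not in the supported set."""
    return find_logical_types(json.loads(schema_text)) - SINK_SUPPORTED_LOGICAL_TYPES


if __name__ == "__main__":
    customers = """
    {"type": "record", "name": "Customer", "fields": [
      {"name": "name", "type": "string"},
      {"name": "visits", "type": {"type": "array",
        "items": {"type": "long", "logicalType": "timestamp-millis"}}},
      {"name": "total_purchased", "type": {"type": "map", "values": "int"}}
    ]}
    """
    print(unsupported_logical_types(customers))
```

For the customers schema in this commit the result is an empty set, because `timestamp-millis` is on the supported list. A schema that used another Avro logical type, such as `uuid`, would be reported.
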
