Commit 8f9b0dc

committed
first draft
1 parent cc2db82 commit 8f9b0dc

5 files changed

+258
-152
lines changed

source/data-formats/extended-json.txt

Lines changed: 72 additions & 1 deletion
@@ -178,6 +178,57 @@ list of dictionaries by using the ``loads()`` method:
178178
{'bin': Binary(b'\x01\x02\x03\x04', 128)}
179179
]
180180

181+
.. _pymongo-extended-json-binary-values:
182+
183+
Reading Binary Values in Python 2
184+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
185+
186+
In Python 3, the driver decodes JSON binary values with subtype 0 to instances of the
187+
``bytes`` class. In Python 2, the driver decodes these values to instances of the ``Binary``
188+
class with subtype 0.
189+
190+
The following code examples show how {+driver-short+} decodes JSON binary values with
191+
subtype 0. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the
192+
corresponding code.
193+
194+
.. tabs::
195+
196+
.. tab:: Python 2
197+
:tabid: python2
198+
199+
.. io-code-block::
200+
:copyable: true
201+
202+
.. input::
203+
:language: python
204+
205+
from bson.json_util import loads
206+
207+
doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}')
208+
print(doc)
209+
210+
.. output::
211+
212+
{u'b': Binary('this is a byte string', 0)}
213+
214+
.. tab:: Python 3
215+
:tabid: python3
216+
217+
.. io-code-block::
218+
:copyable: true
219+
220+
.. input::
221+
:language: python
222+
223+
from bson.json_util import loads
224+
225+
doc = loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}')
226+
print(doc)
227+
228+
.. output::
229+
230+
{'b': b'this is a byte string'}
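To see what a JSON binary value with subtype 0 carries before the driver decodes it, you can unpack one by hand. The following is a minimal sketch that uses only the Python standard library, not ``json_util``. It assumes the legacy ``$binary``/``$type`` layout, where ``$binary`` is a base64 string and ``$type`` is a hexadecimal subtype. It illustrates the wire format only and is not how {+driver-short+} implements decoding.

```python
import base64
import json

# Legacy Extended JSON document with one binary field (assumed layout).
raw = '{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}'

wrapper = json.loads(raw)["b"]

# The payload is base64-encoded; the subtype is a two-digit hex string.
payload = base64.b64decode(wrapper["$binary"])
subtype = int(wrapper["$type"], 16)

print(payload)
print(subtype)
```

The decoded payload and subtype are the two pieces the driver combines into either a ``bytes`` instance or a ``Binary`` instance, depending on the Python version.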
231+
181232
Write Extended JSON
182233
-------------------
183234

@@ -273,10 +324,30 @@ The following example shows how to output Extended JSON in the Canonical format:
273324
Additional Information
274325
----------------------
275326

327+
The resources in the following sections provide more information about working
328+
with Extended JSON.
329+
330+
API Documentation
331+
~~~~~~~~~~~~~~~~~
332+
276333
For more information about the methods and types in ``bson.json_util``, see the following
277334
API documentation:
278335

279336
- `loads() <{+api-root+}bson/json_util.html#bson.json_util.loads>`__
280337
- `dumps() <{+api-root+}bson/json_util.html#bson.json_util.dumps>`__
281338
- `CANONICAL_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.CANONICAL_JSON_OPTIONS>`__
282-
- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__
339+
- `LEGACY_JSON_OPTIONS <{+api-root+}bson/json_util.html#bson.json_util.LEGACY_JSON_OPTIONS>`__
340+
341+
Other Packages
342+
~~~~~~~~~~~~~~
343+
344+
`python-bsonjs <https://pypi.python.org/pypi/python-bsonjs>`__ is another package,
345+
built on top of `libbson <https://github.com/mongodb/libbson>`__,
346+
that can convert BSON to Extended JSON. The ``python-bsonjs`` package doesn't
347+
depend on {+driver-short+} and might offer a performance improvement over
348+
``json_util`` in certain cases.
349+
350+
.. tip:: Use the RawBSONDocument Type
351+
352+
``python-bsonjs`` works best with {+driver-short+} when converting from the
353+
``RawBSONDocument`` type.

source/faq.txt

Lines changed: 0 additions & 148 deletions
Original file line numberDiff line numberDiff line change
@@ -1,148 +0,0 @@
1-
.. docs-landing/source/languages/python.txt
2-
3-
Can {+driver-short+} Load the Results of a Query as a Pandas DataFrame?
4-
-----------------------------------------------------------------------
5-
6-
You can use the `PyMongoArrow <https://www.mongodb.com/docs/languages/python/pymongo-arrow-driver/current/>`__
7-
library to work with numerical or columnar data. PyMongoArrow lets you
8-
load MongoDB query result-sets as
9-
`Pandas DataFrames <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html>`__,
10-
`NumPy ndarrays <https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html>`__, or
11-
`Apache Arrow Tables <https://arrow.apache.org/docs/python/generated/pyarrow.Table.html>`__.
12-
13-
How Can I Encode My Documents to JSON?
14-
--------------------------------------
15-
16-
{+driver-short+} supports some special types, like ``ObjectId``
17-
and ``DBRef``, that aren't supported in JSON. Therefore, Python's ``json`` module won't
18-
work with all documents in {+driver-short+}. Instead, {+driver-short+} includes the
19-
`json_util <https://pymongo.readthedocs.io/en/latest/api/bson/json_util.html>`__
20-
module, a tool for using Python's ``json`` module with BSON documents and
21-
`MongoDB Extended JSON <https://mongodb.com/docs/manual/reference/mongodb-extended-json/>`__.
22-
23-
`python-bsonjs <https://pypi.python.org/pypi/python-bsonjs>`__ is another
24-
BSON-to-MongoDB-Extended-JSON converter, built on top of
25-
`libbson <https://github.com/mongodb/libbson>`__. python-bsonjs doesn't
26-
depend on {+driver-short+} and might offer a performance improvement over
27-
``json_util`` in certain cases.
28-
29-
.. tip::
30-
31-
python-bsonjs works best with {+driver-short+} when using the ``RawBSONDocument``
32-
type.
33-
34-
Does {+driver-short+} Behave Differently in Python 3?
35-
-----------------------------------------------------
36-
37-
{+driver-short+} encodes instances of the ``bytes`` class
38-
as BSON type 5 (binary data) with subtype 0.
39-
In Python 2, these instances are decoded to ``Binary``
40-
with subtype 0. In Python 3, they are decoded back to ``bytes``.
41-
42-
The following code examples use {+driver-short+} to insert a ``bytes`` instance
43-
into MongoDB, and then find the instance.
44-
In Python 2, the byte string is decoded to ``Binary``.
45-
In Python 3, the byte string is decoded back to ``bytes``.
46-
47-
.. tabs::
48-
49-
.. tab:: Python 2.7
50-
:tabid: python-2
51-
52-
.. code-block:: python
53-
54-
>>> import pymongo
55-
>>> c = pymongo.MongoClient()
56-
>>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id
57-
ObjectId('4f9086b1fba5222021000000')
58-
>>> c.test.bintest.find_one()
59-
{u'binary': Binary('this is a byte string', 0), u'_id': ObjectId('4f9086b1fba5222021000000')}
60-
61-
.. tab:: Python 3.7
62-
:tabid: python-3
63-
64-
.. code-block:: python
65-
66-
>>> import pymongo
67-
>>> c = pymongo.MongoClient()
68-
>>> c.test.bintest.insert_one({'binary': b'this is a byte string'}).inserted_id
69-
ObjectId('4f9086b1fba5222021000000')
70-
>>> c.test.bintest.find_one()
71-
{'binary': b'this is a byte string', '_id': ObjectId('4f9086b1fba5222021000000')}
72-
73-
Similarly, Python 2 and 3 behave differently when {+driver-short+} parses JSON binary
74-
values with subtype 0. In Python 2, these values are decoded to instances of ``Binary``
75-
with subtype 0. In Python 3, they're decoded into instances of ``bytes``.
76-
77-
The following code examples use the ``json_util`` module to decode a JSON binary value
78-
with subtype 0. In Python 2, the byte string is decoded to ``Binary``.
79-
In Python 3, the byte string is decoded back to ``bytes``.
80-
81-
.. tabs::
82-
83-
.. tab:: Python 2.7
84-
:tabid: python-2
85-
86-
.. code-block:: python
87-
88-
>>> from bson.json_util import loads
89-
>>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}')
90-
{u'b': Binary('this is a byte string', 0)}
91-
92-
.. tab:: Python 3.7
93-
:tabid: python-3
94-
95-
.. code-block:: python
96-
97-
>>> from bson.json_util import loads
98-
>>> loads('{"b": {"$binary": "dGhpcyBpcyBhIGJ5dGUgc3RyaW5n", "$type": "00"}}')
99-
{'b': b'this is a byte string'}
100-
101-
Can I Share Pickled ObjectIds Between Python 2 and Python 3?
102-
------------------------------------------------------------
103-
104-
If you use Python 2 to pickle an instance of ``ObjectId``,
105-
you can always unpickle it with Python 3. To do so, you must pass
106-
the ``encoding='latin-1'`` option to the ``pickle.loads()`` method.
107-
The following code example shows how to pickle an ``ObjectId`` in Python 2.7, and then
108-
unpickle it in Python 3.7:
109-
110-
.. code-block:: python
111-
:emphasize-lines: 12
112-
113-
# Python 2.7
114-
>>> import pickle
115-
>>> from bson.objectid import ObjectId
116-
>>> oid = ObjectId()
117-
>>> oid
118-
ObjectId('4f919ba2fba5225b84000000')
119-
>>> pickle.dumps(oid)
120-
'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...'
121-
122-
# Python 3.7
123-
>>> import pickle
124-
>>> pickle.loads(b'ccopy_reg\n_reconstructor\np0\n(cbson.objectid\...', encoding='latin-1')
125-
ObjectId('4f919ba2fba5225b84000000')
126-
127-
If you pickled an ``ObjectID`` in Python 2, and want to unpickle it in Python 3,
128-
you must pass the ``protocol`` argument with a value of ``2`` or less to the
129-
``pickle.dumps()`` method.
130-
The following code example shows how to pickle an ``ObjectId`` in Python 3.7, and then
131-
unpickle it in Python 2.7:
132-
133-
.. code-block:: python
134-
:emphasize-lines: 7
135-
136-
# Python 3.7
137-
>>> import pickle
138-
>>> from bson.objectid import ObjectId
139-
>>> oid = ObjectId()
140-
>>> oid
141-
ObjectId('4f96f20c430ee6bd06000000')
142-
>>> pickle.dumps(oid, protocol=2)
143-
b'\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...'
144-
145-
# Python 2.7
146-
>>> import pickle
147-
>>> pickle.loads('\x80\x02cbson.objectid\nObjectId\nq\x00)\x81q\x01c_codecs\nencode\...')
148-
ObjectId('4f96f20c430ee6bd06000000')

source/includes/language-compatibility-table-pymongo.rst

Lines changed: 134 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -196,5 +196,137 @@ Python 3
196196
Python 2
197197
~~~~~~~~
198198

199-
{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a Python 2.7-
200-
compatible alternative interpreter.
199+
{+driver-short+} versions 3.7 through 3.12 are compatible with Python 2.7 and PyPy, a
200+
Python 2.7-compatible alternative interpreter. However, in some cases, {+driver-short+}
201+
applications behave differently when running in a Python 2 environment.
202+
203+
The following sections describe the differences in behavior between Python 2 and Python 3
204+
when using {+driver-short+}.
205+
206+
Binary Data
207+
```````````
208+
209+
In all versions of Python, {+driver-short+} encodes instances of the
210+
`bytes <https://docs.python.org/3/library/stdtypes.html#bytes>`__ class
211+
as binary data with subtype 0, the default subtype for binary data. In Python 3,
212+
{+driver-short+} decodes these values to instances of the ``bytes`` class. In Python 2,
213+
the driver decodes them to instances of the
214+
`Binary <https://pymongo.readthedocs.io/en/4.11/api/bson/binary.html#bson.binary.Binary>`__
215+
class with subtype 0.
216+
217+
The following code examples show how {+driver-short+} decodes instances of the ``bytes``
218+
class. Select the :guilabel:`Python 2` or :guilabel:`Python 3` tab to view the corresponding
219+
code.
220+
221+
.. tabs::
222+
223+
.. tab:: Python 2
224+
:tabid: python2
225+
226+
.. io-code-block::
227+
:copyable: true
228+
229+
.. input::
230+
:language: python
231+
232+
from pymongo import MongoClient
233+
234+
client = MongoClient()
235+
client.test.test.insert_one({'binary': b'this is a byte string'})
236+
doc = client.test.test.find_one()
237+
print(doc)
238+
239+
.. output::
240+
241+
{u'_id': ObjectId('67afb78298f604a28f0247b4'), u'binary': Binary('this is a byte string', 0)}
242+
243+
.. tab:: Python 3
244+
:tabid: python3
245+
246+
.. io-code-block::
247+
:copyable: true
248+
249+
.. input::
250+
:language: python
251+
252+
from pymongo import MongoClient
253+
254+
client = MongoClient()
255+
client.test.test.insert_one({'binary': b'this is a byte string'})
256+
doc = client.test.test.find_one()
257+
print(doc)
258+
259+
.. output::
260+
261+
{'_id': ObjectId('67afb78298f604a28f0247b4'), 'binary': b'this is a byte string'}
262+
263+
The driver behaves the same way when decoding JSON binary values with subtype 0. In
264+
Python 3, it decodes these values to instances of the ``bytes`` class. In Python 2,
265+
the driver decodes them to instances of the ``Binary`` class with subtype 0. For code
266+
examples that show the differences, see the
267+
:ref:`Extended JSON <pymongo-extended-json-binary-values>` page.
268+
269+
Pickled ObjectIds
270+
`````````````````
271+
272+
If you pickled an ``ObjectId`` in Python 2 and want to unpickle it in Python 3, you must
273+
pass ``encoding='latin-1'`` as an argument to the ``pickle.loads()`` method.
274+
275+
The following example shows how to use Python 3 to unpickle an ``ObjectId`` that was
276+
pickled in Python 2:
277+
278+
.. code-block:: python
279+
:emphasize-lines: 2
280+
281+
import pickle
282+
pickle.loads(b'<ObjectId byte stream>', encoding='latin-1')
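You can reproduce the need for ``encoding='latin-1'`` without the driver. The following sketch runs in Python 3 and uses a hand-written protocol 0 pickle of the one-byte Python 2 ``str`` value ``'\xe9'``. This payload is an illustrative stand-in for the raw bytes stored in a pickled ``ObjectId``. It is not taken from a real ``ObjectId`` pickle.

```python
import pickle

# Protocol 0 pickle that Python 2 produces for the one-byte str '\xe9'.
# Illustrative stand-in for the raw bytes inside a pickled ObjectId.
py2_pickle = b"S'\\xe9'\np0\n."

# By default, Python 3 decodes Python 2 str values as ASCII, which fails
# for any byte above 0x7f.
try:
    pickle.loads(py2_pickle)
    default_ok = True
except UnicodeDecodeError:
    default_ok = False

# latin-1 maps each byte 0x00-0xff to the code point with the same value,
# so every byte decodes and the original bytes can be recovered.
value = pickle.loads(py2_pickle, encoding='latin-1')

print(default_ok)
print(value.encode('latin-1'))
```

Because ``latin-1`` is a one-to-one mapping between bytes and code points, encoding the result back to ``latin-1`` returns the bytes that Python 2 originally pickled.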
283+
284+
If a Python 3 application uses a compatible serialization protocol to pickle an ``ObjectId``,
285+
you can use Python 2 to unpickle it. To specify a compatible protocol in Python 3, pass
286+
a value of 0, 1, or 2 for the ``protocol`` parameter of the ``pickle.dumps()`` method.
287+
288+
The following example pickles an ``ObjectId`` in Python 3, then prints the ``ObjectId``
289+
and resulting ``bytes`` instance:
290+
291+
.. io-code-block::
292+
:copyable: true
293+
294+
.. input::
295+
:language: python
296+
297+
import pickle
298+
from bson.objectid import ObjectId
299+
300+
oid = ObjectId()
301+
oid_bytes = pickle.dumps(oid, protocol=2)
302+
print("ObjectId: {}".format(oid))
303+
print("ObjectId bytes: {}".format(oid_bytes))
304+
305+
.. output::
306+
:language: shell
307+
308+
ObjectId: 67af9b1fae9260c0e97eb9eb
309+
ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00...
310+
311+
The following example unpickles the ``ObjectId`` from the previous example, and then
312+
prints the ``bytes`` and ``ObjectId`` instances:
313+
314+
.. io-code-block::
315+
:copyable: true
316+
317+
.. input::
318+
:language: python
319+
320+
import pickle
321+
from bson.objectid import ObjectId
322+
323+
oid_bytes = b'\x80\x02cbson.objectid\nObjectId\nq\x00...'
324+
oid = pickle.loads(oid_bytes)
325+
print("ObjectId bytes: {}".format(oid_bytes))
326+
print("ObjectId: {}".format(oid))
327+
328+
.. output::
329+
:language: shell
330+
331+
ObjectId bytes: b'\x80\x02cbson.objectid\nObjectId\nq\x00)...
332+
ObjectId: 67af9b1fae9260c0e97eb9eb
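Before sending a pickle to a Python 2 application, you can check that it uses protocol 2 or lower, as described above. The following sketch uses the standard library ``pickletools`` module to read the protocol that a pickle declares. The ``pickle_protocol()`` and ``readable_by_python2()`` helpers are hypothetical names introduced for this illustration. They are not part of {+driver-short+}. The sketch pickles a plain dictionary instead of an ``ObjectId`` so that it runs without the driver installed.

```python
import pickle
import pickletools


def pickle_protocol(data):
    """Return the protocol a pickle declares, or None if it has no PROTO opcode.

    Pickles written with protocol 0 or 1 don't start with a PROTO opcode.
    """
    opcode, arg, _pos = next(pickletools.genops(data))
    return arg if opcode.name == "PROTO" else None


def readable_by_python2(data):
    """Return True if the pickle uses protocol 0, 1, or 2."""
    version = pickle_protocol(data)
    return version is None or version <= 2


sample = {"key": "value"}

print(readable_by_python2(pickle.dumps(sample, protocol=2)))
print(readable_by_python2(pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL)))
```

This check inspects only the declared protocol. It doesn't confirm that the classes referenced in the pickle can be imported in the Python 2 environment.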
