DOCSP-46701: Serialization #168

Merged
merged 7 commits into from
Feb 10, 2025
Changes from 4 commits
17 changes: 17 additions & 0 deletions source/index.txt
@@ -24,6 +24,7 @@ MongoDB {+driver-short+} Documentation
Data Formats </data-formats>
Logging </logging>
Monitoring </monitoring>
Serialization </serialization>
Third-Party Tools </tools>
FAQ </faq>
Troubleshooting </troubleshooting>
@@ -100,6 +101,22 @@ Specialized Data Formats
Learn how to work with specialized data formats and custom types in the
:ref:`pymongo-data-formats` section.

Logging
-------

Learn how to configure logging in the :ref:`pymongo-logging` section.

Monitoring
----------

Learn how to monitor changes to your application in the :ref:`pymongo-monitoring` section.

Serialization
-------------

Learn how {+driver-short+} serializes and deserializes data in the
:ref:`pymongo-serialization` section.

Third-Party Tools
-----------------

123 changes: 123 additions & 0 deletions source/serialization.txt
@@ -0,0 +1,123 @@
.. _pymongo-serialization:

=============
Serialization
=============

.. facet::
   :name: genre
   :values: reference

.. meta::
   :keywords: class, map, deserialize

.. contents:: On this page
   :local:
   :backlinks: none
   :depth: 2
   :class: singlecol

Overview
--------

In this guide, you can learn how to use {+driver-short+} to perform
serialization.

Serialization is the process of mapping a {+language+} object to a BSON
document for storage in MongoDB. {+driver-short+} automatically converts basic {+language+}
types into BSON when you insert a document into a collection. Similarly, when you retrieve a
document from a collection, {+driver-short+} automatically converts the returned BSON
back into the corresponding {+language+} types.

The following list shows some {+language+} types that {+driver-short+} can serialize
and deserialize:

- ``str``
- ``int``
- ``float``
- ``bool``
- ``datetime.datetime``
- ``list``
- ``dict``
- ``None``

For a complete list of {+language+}-to-BSON mappings, see the `bson <{+api-root+}bson/index.html>`__
API documentation.

Custom Classes
--------------

To serialize and deserialize custom {+language+} classes, you must implement custom logic

Does it make sense to explicitly call out that serialization and deserialization are required in order to work with custom data classes?

Collaborator Author

That was my intention with this sentence, but if you feel it's not clear enough I can make it more explicit.

I think it's clear enough.

to handle the conversion. The following sections show how to serialize and deserialize
custom classes.

Serializing Custom Classes
~~~~~~~~~~~~~~~~~~~~~~~~~~

To serialize a custom class, you must convert an instance of the class to a dictionary. The
following example serializes a custom class by using the built-in ``vars()`` function, then
inserts the serialized object into a collection:

.. code-block:: python

   class Restaurant:
       def __init__(self, name, cuisine):
           self.name = name
           self.cuisine = cuisine

   restaurant = Restaurant("Example Cafe", "Coffee")
   restaurant_dict = vars(restaurant)

   collection.insert_one(restaurant_dict)

To learn more about inserting documents into a collection, see the :ref:`pymongo-write-insert`
guide.
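
``vars()`` returns only the top-level attributes of an instance, so an attribute that holds another custom object is not converted. One possible approach for nested classes, sketched here with the standard-library ``dataclasses`` module, is to call ``asdict()``, which converts nested dataclass instances recursively. The ``Address`` class and the ``address`` field are hypothetical and are not part of the preceding example.

```python
from dataclasses import asdict, dataclass


@dataclass
class Address:  # hypothetical nested class, for illustration only
    street: str
    zipcode: str


@dataclass
class Restaurant:
    name: str
    cuisine: str
    address: Address


restaurant = Restaurant("Example Cafe", "Coffee", Address("Cafe St", "10003"))

# vars() leaves the nested Address instance as-is, which PyMongo cannot encode
shallow = vars(restaurant)

# asdict() recursively converts nested dataclasses into plain dictionaries
restaurant_dict = asdict(restaurant)
```

The resulting ``restaurant_dict`` contains only dictionaries and strings, so it can be passed to ``insert_one()`` in the same way as the dictionary in the preceding example.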

Deserializing Custom Classes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To deserialize a custom class, you must convert the dictionary back into an instance of
the class. The following example retrieves a document from a collection, then converts
it into an instance of the ``Restaurant`` class from the preceding example:

.. code-block:: python

   def deserialize_restaurant(doc):
       return Restaurant(name=doc["name"], cuisine=doc["cuisine"])

   restaurant_doc = collection.find_one({"name": "Example Cafe"})
   restaurant = deserialize_restaurant(restaurant_doc)

To learn more about retrieving documents from a collection, see the :ref:`pymongo-retrieve`
guide.
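
``find_one()`` returns ``None`` when no document matches the filter, and returned documents include an ``_id`` field. The following sketch shows one way a deserialization helper might account for both cases. The ``from_document()`` method is a hypothetical helper, and a plain dictionary stands in for the result of ``find_one()``.

```python
from typing import Optional


class Restaurant:
    def __init__(self, name, cuisine):
        self.name = name
        self.cuisine = cuisine

    @classmethod
    def from_document(cls, doc: Optional[dict]) -> Optional["Restaurant"]:
        """Hypothetical helper: build a Restaurant from a document, or return None."""
        if doc is None:  # no matching document was found
            return None
        # Read only the fields the class defines; extra fields such as _id are ignored
        return cls(name=doc["name"], cuisine=doc["cuisine"])


# Stand-in for a document returned by find_one()
restaurant_doc = {"_id": 1, "name": "Example Cafe", "cuisine": "Coffee"}

restaurant = Restaurant.from_document(restaurant_doc)
missing = Restaurant.from_document(None)
```

Keeping the conversion on the class itself is a design choice, not a requirement. A standalone function such as ``deserialize_restaurant()`` works equally well.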

Serializing Ordered Documents
-----------------------------

Because the key-value pairs in {+language+} dictionaries are unordered, the order of

All versions of Python we support use ordered dictionaries where the order of inserted keys determines the order of the dictionary. Is the behavior described here still occurring, or is it an artifact of older Python versions that still had unordered dictionaries?

Collaborator Author

I grabbed the info from this FAQ page. However, I see that this page was pulled from PyMongo docs, which leads me to believe they haven't been updated in a while.

Should I just remove this section completely?

@NoahStapp NoahStapp Feb 10, 2025

Ah, that makes sense. Yeah, that section of the troubleshooting FAQ should be removed for no longer being accurate.

This whole section can be removed, yup!

fields in serialized BSON documents can differ from the order of fields in the original
dictionary. This behavior can cause issues when {+driver-short+} compares subdocuments
to each other, since {+driver-short+} only considers subdocuments to be equal if their fields
are in identical order.

To preserve the order of keys when serializing and deserializing BSON,
use the `SON <{+api-root+}bson/son.html>`__ class. You must also configure your collection
to use SON for serialization and deserialization by creating a ``CodecOptions`` instance
that sets ``document_class=SON`` and passing it as the ``codec_options`` argument to the
``with_options()`` method of a collection.

The following example retrieves a document
that has a ``location`` field value of ``{"street": "Cafe St", "zipcode": "10003"}`` from
the ``restaurants`` collection:

.. code-block:: python

   from bson import CodecOptions, SON

   opts = CodecOptions(document_class=SON)
   collection = db.get_collection("restaurants")
   son_collection = collection.with_options(codec_options=opts)
   doc = son_collection.find_one({"location": SON([("street", "Cafe St"), ("zipcode", "10003")])})

For more information about subdocument matching, see the
:manual:`Query on Embedded/Nested Documents </tutorial/query-embedded-documents/>`
guide in the {+mdb-server+} documentation.