Cython bindings for libbson #1

Merged
merged 4 commits into from
Feb 4, 2021
Changes from 2 commits
3 changes: 0 additions & 3 deletions bindings/python/.gitignore
Original file line number Diff line number Diff line change
@@ -31,8 +31,5 @@ MANIFEST
*.c
*.cpp

# PyCharm
*.idea/

Collaborator

It would be good to keep this.

Contributor Author

Reverted.

# Sphinx documentation
docs/_build/
8 changes: 8 additions & 0 deletions bindings/python/README.rst
@@ -0,0 +1,8 @@
============
PyMongoArrow
============
:Info: A companion library to PyMongo that makes it easy to move data
       between MongoDB and Apache Arrow. See
       `GitHub <https://github.com/mongodb-labs/mongo-arrow/tree/main/bindings/python>`_
       for the latest source.
:Author: Prashant Mital
31 changes: 31 additions & 0 deletions bindings/python/pymongoarrow/__init__.py
@@ -0,0 +1,31 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from pymongoarrow.libbson.version import __version__ as libbson_version
from pymongoarrow.version import __version__, _MIN_LIBBSON_VERSION


try:
    from pkg_resources import parse_version as _parse_version
except ImportError:
    from distutils.version import LooseVersion as _LooseVersion

    def _parse_version(version):
        return _LooseVersion(version)


if _parse_version(libbson_version) < _parse_version(_MIN_LIBBSON_VERSION):
    raise RuntimeError(
        "Expected libbson version {} or greater, found {}".format(
            _MIN_LIBBSON_VERSION, libbson_version))
Collaborator

I think calling into the library at import time like this could cause problems down the line. Suppose we later integrate this library with pymongo; pymongo would then look like this:

try:
    import pymongoarrow
except ImportError:
    pass  # not installed 

With the current design we would need to catch RuntimeError as well, which is a bit odd:

try:
    import pymongoarrow
except (ImportError, RuntimeError):
    pass  # not installed 

I suggest we defer calling into the library until it's first used. For example:

__version__ = bson_get_version().decode('utf-8')

Would become:

def libbson_version():
    return bson_get_version().decode('utf-8')

Feel free to open a follow-up ticket if you'd like to merge this as is and discuss the issue further in Jira.

Edit: a simple alternative option might be to change this line to raise an ImportError instead of a RuntimeError.

Contributor Author

Good point. However, there is no single point of entry for loading libbson in this case, so even if the version check were done lazily, it would have to be added explicitly to every method in the libbson bindings that external code could call directly.

I have changed it to raise an ImportError instead.
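
A rough sketch of what raising ImportError buys the integrating code, under stated assumptions: `require_libbson` and `try_import` are hypothetical names used only for this illustration, and the version parsing is simplified to dotted numeric strings rather than the `_parse_version` helper in the diff.

```python
def require_libbson(found, minimum):
    """Hypothetical import-time check that fails with ImportError."""
    def as_tuple(version):
        # Simplified for illustration: dotted numeric versions only.
        return tuple(int(part) for part in version.split("."))

    if as_tuple(found) < as_tuple(minimum):
        raise ImportError(
            "Expected libbson version {} or greater, found {}".format(
                minimum, found))


def try_import(found, minimum="1.17.0"):
    """Mimics the optional-import pattern from the review comment."""
    try:
        require_libbson(found, minimum)
    except ImportError:
        return False  # handled exactly like "not installed"
    return True
```

A caller that already guards the import with `except ImportError` needs no extra clause: an outdated libbson and a missing package take the same path.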

129 changes: 129 additions & 0 deletions bindings/python/pymongoarrow/libbson/__init__.pxd
@@ -0,0 +1,129 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Cython compiler directives
# cython: language_level=3
# distutils: language=c
from libc.stdint cimport int32_t, int64_t, uint8_t, uint32_t, uint64_t


# libbson type wrappings
# wrappings are not defined for the following types:
# - bson_json_reader_t
# - bson_md5_t (deprecated)
# - bson_string_t
# - bson_subtype_t
# - bson_unichar_t
# - bson_value_t
# - bson_visitor_t
# - bson_writer_t
cdef extern from "bson/bson.h":
    ctypedef struct bson_t:
        uint32_t flags
        uint32_t len
        uint8_t padding[120]

    ctypedef struct bson_context_t:
        pass

    ctypedef struct bson_decimal128_t:
        pass

    ctypedef struct bson_error_t:
        uint32_t domain
        uint32_t code
        char message[504]

    ctypedef struct bson_iter_t:
        pass

    ctypedef struct bson_oid_t:
        uint8_t bytes[12]

    ctypedef struct bson_reader_t:
        pass

    ctypedef enum bson_type_t:
        BSON_TYPE_EOD,
        BSON_TYPE_DOUBLE,
        BSON_TYPE_UTF8,
        BSON_TYPE_DOCUMENT,
        BSON_TYPE_ARRAY,
        BSON_TYPE_BINARY,
        BSON_TYPE_UNDEFINED,
        BSON_TYPE_OID,
        BSON_TYPE_BOOL,
        BSON_TYPE_DATE_TIME,
        BSON_TYPE_NULL,
        BSON_TYPE_REGEX,
        BSON_TYPE_DBPOINTER,
        BSON_TYPE_CODE,
        BSON_TYPE_SYMBOL,
        BSON_TYPE_CODEWSCOPE,
        BSON_TYPE_INT32,
        BSON_TYPE_TIMESTAMP,
        BSON_TYPE_INT64,
        BSON_TYPE_DECIMAL128,
        BSON_TYPE_MAXKEY,
        BSON_TYPE_MINKEY


# bson_t API
cdef extern from "bson/bson.h":
    void bson_destroy(bson_t *bson)

    const uint8_t * bson_get_data(const bson_t *bson)

    bint bson_has_field(const bson_t *bson, const char *key)

    bint bson_init_static(bson_t *b, const uint8_t *data, size_t length)

    char * bson_as_json(const bson_t *bson, size_t *length)


# bson_iter_t API
cdef extern from "bson/bson.h":
    bint bson_iter_init(bson_iter_t *iter, const bson_t *bson)

    bint bson_iter_init_from_data(bson_iter_t *iter, const uint8_t *data, size_t length)

    bint bson_iter_next(bson_iter_t *iter)

    const char * bson_iter_key(const bson_iter_t *iter)

    bson_type_t bson_iter_type(const bson_iter_t *iter)

    bint bson_iter_bool(const bson_iter_t *iter)

    int64_t bson_iter_date_time(const bson_iter_t *iter)

    # TODO: add decimal128

    double bson_iter_double(const bson_iter_t *iter)

    int32_t bson_iter_int32(const bson_iter_t *iter)

    int64_t bson_iter_int64(const bson_iter_t *iter)


# bson_reader_t API
cdef extern from "bson/bson.h":
    bson_reader_t * bson_reader_new_from_data(const uint8_t *data, size_t length)

    const bson_t * bson_reader_read(bson_reader_t *reader, bint *reached_eof)


# runtime version checking API
cdef extern from "bson/bson.h":
    const char * bson_get_version()
21 changes: 21 additions & 0 deletions bindings/python/pymongoarrow/libbson/version.pyx
@@ -0,0 +1,21 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Cython compiler directives
# cython: language_level=3
# distutils: language=c
from pymongoarrow.libbson cimport bson_get_version


__version__ = bson_get_version().decode('utf-8')
17 changes: 17 additions & 0 deletions bindings/python/pymongoarrow/version.py
@@ -0,0 +1,17 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

__version__ = '0.1.0.dev0'

_MIN_LIBBSON_VERSION = '1.17.0'
30 changes: 30 additions & 0 deletions bindings/python/setup.py
@@ -0,0 +1,30 @@
from setuptools import find_packages, setup
from Cython.Build import cythonize

import os


def get_pymongoarrow_version():
    """Single source the version."""
    version_file = os.path.realpath(os.path.join(
        os.path.dirname(__file__), 'pymongoarrow', 'version.py'))
    version = {}
    with open(version_file) as fp:
        exec(fp.read(), version)
    return version['__version__']


def get_extension_modules():
    modules = cythonize(['pymongoarrow/*.pyx',
                         'pymongoarrow/libbson/*.pyx'])
    for module in modules:
        module.libraries.append('bson-1.0')
    return modules


setup(
    name='pymongoarrow',
    version=get_pymongoarrow_version(),
    packages=find_packages(),
    ext_modules=get_extension_modules(),
    setup_requires=['cython >= 0.29'])
22 changes: 22 additions & 0 deletions bindings/python/test/test_libbson.py
@@ -0,0 +1,22 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from unittest import TestCase

from pymongoarrow.libbson.version import __version__ as libbson_version


class TestLibbson(TestCase):
    def test_version(self):
        self.assertIsNotNone(libbson_version)
        self.assertIsInstance(libbson_version, str)
22 changes: 22 additions & 0 deletions bindings/python/test/test_pymongoarrow.py
@@ -0,0 +1,22 @@
# Copyright 2021-present MongoDB, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from unittest import TestCase

from pymongoarrow.version import __version__


class TestPyMongoArrow(TestCase):
    def test_version(self):
        self.assertIsNotNone(__version__)
        self.assertIsInstance(__version__, str)