Commit 736bf2a

robsdedude, fbiville, and thelonelyvulpes authored
Accept more types as query parameters (#881)
Implement serialization support for

* python types: tuple
* pandas types:
  * all but period, interval, pyarrow
* numpy types:
  * all but void, complexfloating

pyarrow support was not implemented as it would either require more ifs in the recursive packing function, making it (driver's hot-path) slower for non-pyarrow use-cases. Or alternatively, transformers would have to be used making pyarrow type serialization rather slow.

Co-authored-by: Florent Biville <[email protected]>
Co-authored-by: Grant Lodge <[email protected]>
1 parent 0e9d09a commit 736bf2a

21 files changed: +4614 -332 lines changed
-60.3 KB

docs/source/_images/core_type_mappings.svg

Lines changed: 3484 additions & 1 deletion

docs/source/api.rst

Lines changed: 52 additions & 1 deletion
@@ -1099,7 +1099,7 @@ The core types with their general mappings are listed below:
 +------------------------+---------------------------------------------------------------------------------------------------------------------------+
 | String | :class:`str` |
 +------------------------+---------------------------------------------------------------------------------------------------------------------------+
-| Bytes :sup:`[1]` | :class:`bytearray` |
+| Bytes :sup:`[1]` | :class:`bytes` |
 +------------------------+---------------------------------------------------------------------------------------------------------------------------+
 | List | :class:`list` |
 +------------------------+---------------------------------------------------------------------------------------------------------------------------+
@@ -1118,6 +1118,57 @@ The diagram below illustrates the actual mappings between the various layers, fr
     :target: ./_images/core_type_mappings.svg
 
 
+Extended Data Types
+===================
+
+The driver supports serializing more types (as parameters in).
+However, they will have to be mapped to the existing Bolt types (see above) when they are sent to the server.
+This means, the driver will never return these types in results.
+
+When in doubt, you can test the type conversion like so::
+
+    import neo4j
+
+
+    with neo4j.GraphDatabase.driver(URI, auth=AUTH) as driver:
+        with driver.session() as session:
+            type_in = ("foo", "bar")
+            result = session.run("RETURN $x", x=type_in)
+            type_out = result.single()[0]
+            print(type(type_out))
+            print(type_out)
+
+Which in this case would yield::
+
+    <class 'list'>
+    ['foo', 'bar']
+
+
++-----------------------------------+---------------------------------+---------------------------------------+
+| Parameter Type                    | Bolt Type                       | Result Type                           |
++===================================+=================================+=======================================+
+| :class:`tuple`                    | List                            | :class:`list`                         |
++-----------------------------------+---------------------------------+---------------------------------------+
+| :class:`bytearray`                | Bytes                           | :class:`bytes`                        |
++-----------------------------------+---------------------------------+---------------------------------------+
+| numpy\ :sup:`[2]` ``ndarray``     | (nested) List                   | (nested) :class:`list`                |
++-----------------------------------+---------------------------------+---------------------------------------+
+| pandas\ :sup:`[3]` ``DataFrame``  | Map[str, List[_]] :sup:`[4]`    | :class:`dict`                         |
++-----------------------------------+---------------------------------+---------------------------------------+
+| pandas ``Series``                 | List                            | :class:`list`                         |
++-----------------------------------+---------------------------------+---------------------------------------+
+| pandas ``Array``                  | List                            | :class:`list`                         |
++-----------------------------------+---------------------------------+---------------------------------------+
+
+.. Note::
+
+    2. ``void`` and ``complexfloating`` typed numpy ``ndarray``\s are not supported.
+    3. ``Period``, ``Interval``, and ``pyarrow`` pandas types are not supported.
+    4. A pandas ``DataFrame`` will be serialized as Map with the column names mapping to the column values (as Lists).
+       Just like with ``dict`` objects, the column names need to be :class:`str` objects.
+
+
+
 ****************
 Graph Data Types
 ****************
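
The new documentation section describes a one-way mapping: `tuple` and `bytearray` parameters go in, but `list` and `bytes` come back. The following is a minimal sketch of that round trip in plain Python, with no server involved. The helper `normalize_parameter` is hypothetical and not part of the driver; it only imitates the `tuple` and `bytearray` rows of the table.

```python
def normalize_parameter(value):
    """Hypothetical helper: imitate how tuple/bytearray parameters come back."""
    if isinstance(value, (tuple, list)):
        # A Bolt List is always returned as a Python list.
        return [normalize_parameter(v) for v in value]
    if isinstance(value, bytearray):
        # Bolt Bytes are returned as immutable bytes.
        return bytes(value)
    if isinstance(value, dict):
        return {k: normalize_parameter(v) for k, v in value.items()}
    return value


type_in = ("foo", bytearray(b"\x01"), ("nested",))
type_out = normalize_parameter(type_in)
print(type(type_out))
print(type_out)
```

The real conversion happens during serialization and on the server round trip, so the documented `RETURN $x` query remains the authoritative check.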

pyproject.toml

Lines changed: 9 additions & 2 deletions
@@ -45,10 +45,17 @@ dynamic = ["version", "readme"]
 Homepage = "https://github.com/neo4j/neo4j-python-driver"
 
 [project.optional-dependencies]
-pandas = ["pandas>=1.0.0"]
+numpy = ["numpy >= 1.7.0, < 2.0.0"]
+pandas = [
+    "pandas >= 1.1.0, < 2.0.0",
+    "numpy >= 1.7.0, < 2.0.0",
+]
 
 [build-system]
-requires = ["setuptools~=65.6", "tomlkit~=0.11.6"]
+requires = [
+    "setuptools~=65.6",
+    "tomlkit~=0.11.6",
+]
 build-backend = "setuptools.build_meta"
 
 # still in beta

requirements-dev.txt

Lines changed: 2 additions & 0 deletions
@@ -18,7 +18,9 @@ tomlkit~=0.11.6
 # needed for running tests
 coverage[toml]>=5.5
 mock>=4.0.3
+numpy>=1.7.0
 pandas>=1.0.0
+pyarrow>=1.0.0
 pytest>=6.2.5
 pytest-asyncio>=0.16.0
 pytest-benchmark>=3.4.1

src/neo4j/_codec/hydration/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -17,13 +17,15 @@
 
 from ._common import (
     BrokenHydrationObject,
+    DehydrationHooks,
     HydrationScope,
 )
 from ._interface import HydrationHandlerABC
 
 
 __all__ = [
     "BrokenHydrationObject",
+    "DehydrationHooks",
     "HydrationHandlerABC",
     "HydrationScope",
 ]

src/neo4j/_codec/hydration/_common.py

Lines changed: 40 additions & 1 deletion
@@ -16,12 +16,51 @@
 # limitations under the License.
 
 
+import typing as t
 from copy import copy
+from dataclasses import dataclass
 
 from ...graph import Graph
 from ..packstream import Structure
 
 
+@dataclass
+class DehydrationHooks:
+    exact_types: t.Dict[t.Type, t.Callable[[t.Any], t.Any]]
+    subtypes: t.Dict[t.Type, t.Callable[[t.Any], t.Any]]
+
+    def update(self, exact_types=None, subtypes=None):
+        exact_types = exact_types or {}
+        subtypes = subtypes or {}
+        self.exact_types.update(exact_types)
+        self.subtypes.update(subtypes)
+
+    def extend(self, exact_types=None, subtypes=None):
+        exact_types = exact_types or {}
+        subtypes = subtypes or {}
+        return DehydrationHooks(
+            exact_types={**self.exact_types, **exact_types},
+            subtypes={**self.subtypes, **subtypes},
+        )
+
+    def get_transformer(self, item):
+        type_ = type(item)
+        transformer = self.exact_types.get(type_)
+        if transformer is not None:
+            return transformer
+        transformer = next(
+            (
+                f
+                for super_type, f in self.subtypes.items()
+                if isinstance(item, super_type)
+            ),
+            None,
+        )
+        if transformer is not None:
+            return transformer
+        return None
+
+
 class BrokenHydrationObject:
     """
     Represents an object from the server, not understood by the driver.
@@ -68,7 +107,7 @@ def __init__(self, hydration_handler, graph_hydrator):
             list: self._hydrate_list,
             dict: self._hydrate_dict,
         }
-        self.dehydration_hooks = hydration_handler.dehydration_functions
+        self.dehydration_hooks = hydration_handler.dehydration_hooks
 
     def _hydrate_structure(self, value):
         f = self._struct_hydration_functions.get(value.tag)
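
The `get_transformer` method added above does a two-stage lookup: first a dictionary hit on the exact type, then a linear `isinstance` scan over the registered supertypes. Below is a minimal standalone sketch of that behaviour so it can be run without the driver. The class name `Hooks`, the `Point` namedtuple, and both transformers are made up for illustration.

```python
import typing as t
from collections import namedtuple
from dataclasses import dataclass, field


@dataclass
class Hooks:
    """Standalone imitation of the two-stage lookup; not the driver's class."""
    exact_types: t.Dict[type, t.Callable[[t.Any], t.Any]] = field(default_factory=dict)
    subtypes: t.Dict[type, t.Callable[[t.Any], t.Any]] = field(default_factory=dict)

    def get_transformer(self, item):
        # Stage 1: cheap dictionary lookup on the exact type.
        transformer = self.exact_types.get(type(item))
        if transformer is not None:
            return transformer
        # Stage 2: linear scan, first registered supertype that matches wins.
        return next(
            (f for super_type, f in self.subtypes.items()
             if isinstance(item, super_type)),
            None,
        )


Point = namedtuple("Point", "x y")  # illustrative subclass of tuple

hooks = Hooks(
    exact_types={tuple: list},
    subtypes={tuple: lambda value: ["subtype", *value]},
)

exact = hooks.get_transformer((1, 2))          # type is exactly tuple
fallback = hooks.get_transformer(Point(1, 2))  # only matches via isinstance
missing = hooks.get_transformer("text")        # nothing registered

print(exact((1, 2)))
print(fallback(Point(1, 2)))
print(missing)
```

Values of a registered exact type never reach the `isinstance` scan, which keeps the common case to a single dictionary lookup.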

src/neo4j/_codec/hydration/_interface/__init__.py

Lines changed: 4 additions & 1 deletion
@@ -18,11 +18,14 @@
 
 import abc
 
+from .._common import DehydrationHooks
+
 
 class HydrationHandlerABC(abc.ABC):
     def __init__(self):
         self.struct_hydration_functions = {}
-        self.dehydration_functions = {}
+        self.dehydration_hooks = DehydrationHooks(exact_types={},
+                                                  subtypes={})
 
     @abc.abstractmethod
     def new_hydration_scope(self):

src/neo4j/_codec/hydration/v1/hydration_handler.py

Lines changed: 27 additions & 4 deletions
@@ -23,6 +23,10 @@
     timedelta,
 )
 
+from ...._optional_deps import (
+    np,
+    pd,
+)
 from ....graph import (
     Graph,
     Node,
@@ -159,8 +163,7 @@ def __init__(self):
             b"d": temporal.hydrate_datetime,  # no time zone
             b"E": temporal.hydrate_duration,
         }
-        self.dehydration_functions = {
-            **self.dehydration_functions,
+        self.dehydration_hooks.update(exact_types={
             Point: spatial.dehydrate_point,
             CartesianPoint: spatial.dehydrate_point,
             WGS84Point: spatial.dehydrate_point,
@@ -172,7 +175,19 @@ def __init__(self):
             datetime: temporal.dehydrate_datetime,
             Duration: temporal.dehydrate_duration,
             timedelta: temporal.dehydrate_timedelta,
-        }
+        })
+        if np is not None:
+            self.dehydration_hooks.update(exact_types={
+                np.datetime64: temporal.dehydrate_np_datetime,
+                np.timedelta64: temporal.dehydrate_np_timedelta,
+            })
+        if pd is not None:
+            self.dehydration_hooks.update(exact_types={
+                pd.Timestamp: temporal.dehydrate_pandas_datetime,
+                pd.Timedelta: temporal.dehydrate_pandas_timedelta,
+                type(pd.NaT): lambda _: None,
+            })
+
 
     def patch_utc(self):
         from ..v2 import temporal as temporal_v2
@@ -186,10 +201,18 @@ def patch_utc(self):
             b"i": temporal_v2.hydrate_datetime,
         })
 
-        self.dehydration_functions.update({
+        self.dehydration_hooks.update(exact_types={
             DateTime: temporal_v2.dehydrate_datetime,
             datetime: temporal_v2.dehydrate_datetime,
         })
+        if np is not None:
+            self.dehydration_hooks.update(exact_types={
+                np.datetime64: temporal_v2.dehydrate_np_datetime,
+            })
+        if pd is not None:
+            self.dehydration_hooks.update(exact_types={
+                pd.Timestamp: temporal_v2.dehydrate_pandas_datetime,
+            })
 
     def new_hydration_scope(self):
         self._created_scope = True
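
The handler registers the numpy and pandas hooks only when `np` and `pd` are not `None`. The `_optional_deps` module itself is not shown in this excerpt, so the following is only a sketch of how such a guard could work. The helper `optional_import` and the string placeholders standing in for the dehydrator functions are assumptions.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Either a module object or None, depending on the environment.
np = optional_import("numpy")
pd = optional_import("pandas")

# Register hooks only for the libraries that are actually available.
hooks = {}
if np is not None:
    hooks[np.datetime64] = "placeholder for a numpy datetime dehydrator"
if pd is not None:
    hooks[pd.Timestamp] = "placeholder for a pandas timestamp dehydrator"

print(sorted(key.__name__ for key in hooks))
```

With this pattern the core package can be imported without numpy or pandas installed, and the extra hooks appear only when the optional extras are present.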

src/neo4j/_codec/hydration/v1/temporal.py

Lines changed: 98 additions & 0 deletions
@@ -22,10 +22,17 @@
     timedelta,
 )
 
+from ...._optional_deps import (
+    np,
+    pd,
+)
 from ....time import (
     Date,
     DateTime,
     Duration,
+    MAX_YEAR,
+    MIN_YEAR,
+    NANO_SECONDS,
     Time,
 )
 from ...packstream import Structure
@@ -171,6 +178,50 @@ def seconds_and_nanoseconds(dt):
                              int(tz.utcoffset(value).total_seconds()))
 
 
+if np is not None:
+    def dehydrate_np_datetime(value):
+        """ Dehydrator for `numpy.datetime64` values.
+
+        :param value:
+        :type value: numpy.datetime64
+        :returns:
+        """
+        if np.isnat(value):
+            return None
+        year = value.astype("datetime64[Y]").astype(int) + 1970
+        if not 0 < year <= 9999:
+            # while we could encode years outside the range, they would fail
+            # when retrieved from the database.
+            raise ValueError(f"Year out of range ({MIN_YEAR:d}..{MAX_YEAR:d}) "
+                             f"found {year}")
+        seconds = value.astype(np.dtype("datetime64[s]")).astype(int)
+        nanoseconds = (value.astype(np.dtype("datetime64[ns]")).astype(int)
+                       % NANO_SECONDS)
+        return Structure(b"d", seconds, nanoseconds)
+
+
+if pd is not None:
+    def dehydrate_pandas_datetime(value):
+        """ Dehydrator for `pandas.Timestamp` values.
+
+        :param value:
+        :type value: pandas.Timestamp
+        :returns:
+        """
+        return dehydrate_datetime(
+            DateTime(
+                value.year,
+                value.month,
+                value.day,
+                value.hour,
+                value.minute,
+                value.second,
+                value.microsecond * 1000 + value.nanosecond,
+                value.tzinfo,
+            )
+        )
+
+
 def hydrate_duration(months, days, seconds, nanoseconds):
     """ Hydrator for `Duration` values.
 
@@ -205,3 +256,50 @@ def dehydrate_timedelta(value):
     seconds = value.seconds
     nanoseconds = 1000 * value.microseconds
     return Structure(b"E", months, days, seconds, nanoseconds)
+
+
+if np is not None:
+    _NUMPY_DURATION_UNITS = {
+        "Y": "years",
+        "M": "months",
+        "W": "weeks",
+        "D": "days",
+        "h": "hours",
+        "m": "minutes",
+        "s": "seconds",
+        "ms": "milliseconds",
+        "us": "microseconds",
+        "ns": "nanoseconds",
+    }
+
+    def dehydrate_np_timedelta(value):
+        """ Dehydrator for `numpy.timedelta64` values.
+
+        :param value:
+        :type value: numpy.timedelta64
+        :returns:
+        """
+        if np.isnat(value):
+            return None
+        unit, step_size = np.datetime_data(value)
+        numer = int(value.astype(int))
+        # raise RuntimeError((type(numer), type(step_size)))
+        kwarg = _NUMPY_DURATION_UNITS.get(unit)
+        if kwarg is not None:
+            return dehydrate_duration(Duration(**{kwarg: numer * step_size}))
+        return dehydrate_duration(Duration(
+            nanoseconds=value.astype("timedelta64[ns]").astype(int)
+        ))
+
+
+if pd is not None:
+    def dehydrate_pandas_timedelta(value):
+        """ Dehydrator for `pandas.Timedelta` values.
+
+        :param value:
+        :type value: pandas.Timedelta
+        :returns:
+        """
+        return dehydrate_duration(Duration(
+            nanoseconds=value.value
+        ))
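
`dehydrate_np_datetime` sends a timestamp as a pair of whole seconds and a nanosecond remainder taken modulo `NANO_SECONDS`. Below is a numpy-free sketch of related arithmetic, assuming the timestamp is already available as an integer count of nanoseconds since the Unix epoch. The function name `split_epoch_nanoseconds` is hypothetical, and the constant is assumed to match `neo4j.time.NANO_SECONDS`.

```python
NANO_SECONDS = 1_000_000_000  # assumed to match neo4j.time.NANO_SECONDS


def split_epoch_nanoseconds(total_ns):
    """Split integer epoch nanoseconds into (seconds, nanoseconds).

    divmod uses floor division, so the nanosecond part is always in
    the range 0 <= nanoseconds < NANO_SECONDS, even before 1970.
    """
    seconds, nanoseconds = divmod(total_ns, NANO_SECONDS)
    return seconds, nanoseconds


after_epoch = split_epoch_nanoseconds(1_500_000_000)
before_epoch = split_epoch_nanoseconds(-1)
print(after_epoch)
print(before_epoch)
```

One nanosecond before the epoch comes out as second `-1` plus `999_999_999` nanoseconds, so both parts can be recombined without losing precision.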
