-
-
Notifications
You must be signed in to change notification settings - Fork 7k
Improve hot path performance of is_simple_callable() and get_attribute() #8318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay - my main comment here is that we ought to keep the pull request as absolutely minimal as possible.
@@ -63,22 +63,25 @@ def is_simple_callable(obj): | |||
""" | |||
True if the object is a callable that takes no arguments. | |||
""" | |||
if not callable(obj): | |||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be up for considering this pull request if it only included this change, as well as an example of the cases that it makes an improvement for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, checking parameters dict for non-emptiness is rather an over-optimization at the cost of cleanliness. It makes the function perform slightly better on simple callables, but I agree that it was a bad idea to include this in a pull request.
As for swapping the order of isfunction()
and ismethod()
check, I believe it's harmless, and its effect is noticeable, as it's more common for a callable object attribute to be a method rather than a standalone function. But it's more about code logic than a real performance gain.
Anyway, I' ll split the two changes in is_simple_callable()
and one change in get_attribute()
into three pull requests for you to consider them separately.
I believe it's more common for serializer field source to be an attribute, not a callable or dictionary item, so early return from these functions will improve performance in most real-world use cases. My personal use case is a geospatial application which serves a lot of non-paginated data. Also, I use fast custom orjson
based rendering, and it makes serializer performance improvements even more noticeable.
Below, I added some benchmark-like code to play with if someone is interested. It's not a real request processing benchmark but I included serializer creation and JSON serialization into the benchmark function to make it more realistic and still keep the code in a single Python file. The more attribute based fields serializer has, the more noticeable performance improvement you get.
Benchmark code
import functools
import inspect
import json
from collections.abc import Mapping
from timeit import timeit
from rest_framework import fields, serializers
from rest_framework.fields import BuiltinSignatureError
def is_simple_callable(obj):
"""
True if the object is a callable that takes no arguments.
"""
if not callable(obj):
return False
# Bail early since we cannot inspect built-in function signatures.
if inspect.isbuiltin(obj):
raise BuiltinSignatureError(
'Built-in function signatures are not inspectable. '
'Wrap the function call in a simple, pure Python function.')
if not (inspect.isfunction(obj) or inspect.ismethod(obj) or isinstance(obj, functools.partial)):
return False
sig = inspect.signature(obj)
params = sig.parameters.values()
return all(
param.kind == param.VAR_POSITIONAL or
param.kind == param.VAR_KEYWORD or
param.default != param.empty
for param in params
)
def get_attribute(instance, attrs):
"""
Similar to Python's built in `getattr(instance, attr)`,
but takes a list of nested attributes, instead of a single attribute.
Also accepts either attribute lookup on objects or dictionary lookups.
"""
for attr in attrs:
try:
if hasattr(instance, 'items') and isinstance(instance, Mapping):
instance = instance[attr]
else:
instance = getattr(instance, attr)
except ObjectDoesNotExist:
return None
if is_simple_callable(instance):
try:
instance = instance()
except (AttributeError, KeyError) as exc:
# If we raised an Attribute or KeyError here it'd get treated
# as an omitted field in `Field.get_attribute()`. Instead we
# raise a ValueError to ensure the exception is not masked.
raise ValueError('Exception raised in callable attribute "{}"; original exception was: {}'.format(attr, exc))
return instance
def patch_is_simple_callable():
fields.is_simple_callable = is_simple_callable
def patch_get_attribute():
fields.get_attribute = get_attribute
class SampleModel:
def __init__(self, attributes=1):
for i in range(attributes):
setattr(self, f's{i}', f'Value {i}')
def f(self):
return f'{id(self)}'
class SampleSerializer(serializers.Serializer):
def __init__(self, *args, attributes=1, callables=1, **kwargs):
super().__init__(*args, **kwargs)
for i in range(attributes):
self.fields[f's{i}'] = serializers.CharField()
for i in range(callables):
self.fields[f'f{i}'] = serializers.ReadOnlyField(source='f')
def benchmark_code():
ATTRIBUTE_FIELDS = 1
CALLABLE_FIELDS = 1
OBJECTS=100
objs = [SampleModel(attributes=ATTRIBUTE_FIELDS) for i in range(OBJECTS)]
serializer = SampleSerializer(
objs, attributes=ATTRIBUTE_FIELDS, callables=CALLABLE_FIELDS, many=True
)
return json.dumps(serializer.data)
if __name__ == '__main__':
NUMBER = 1000
print('Original code')
print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))
print('Patched is_simple_callable()')
patch_is_simple_callable()
print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))
print('Patched is_simple_callable() and get_attribute()')
patch_get_attribute()
print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))
# Bail early since we cannot inspect built-in function signatures. | ||
if inspect.isbuiltin(obj): | ||
raise BuiltinSignatureError( | ||
'Built-in function signatures are not inspectable. ' | ||
'Wrap the function call in a simple, pure Python function.') | ||
|
||
if not (inspect.isfunction(obj) or inspect.ismethod(obj) or isinstance(obj, functools.partial)): | ||
if not (inspect.ismethod(obj) or inspect.isfunction(obj) or isinstance(obj, functools.partial)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest that we don't include any other modifications within the function. Or that if we do want to consider other changes, then we do so as a separate issue.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Superseeded by #8502 |
Description
While profiling my DRF based project, I noticed that
is_simple_callable()
is a kind of performance bottleneck in Django REST Framework serialization because it uses functions found ininspect
module which are not particularly fast. My idea is to perform a quick check usingcallable()
builtin before calling these functions to return early in the most common case when a non-callable attribute is being processed. It givesis_simple_callable()
7.5x performance increase in this case without noticeable slowdown of other execution paths becausecallable()
check is very fast.The same way, a simple
hasattr
check inget_attribute()
makes it possible to avoid a heavyisinstance
check in case when object instance is not a dictionary. This results in 1.6x execution time improvement for non-dictionaries and 10-15% slowdown for dictionaries.