Skip to content

Improve hot path performance of is_simple_callable() and get_attribute() #8318

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Ariki
Copy link

@Ariki Ariki commented Jan 4, 2022

Description

While profiling my DRF based project, I noticed that is_simple_callable() is a kind of performance bottleneck in Django REST Framework serialization because it uses functions found in inspect module which are not particularly fast. My idea is to perform a quick check using callable() builtin before calling these functions to return early in the most common case when a non-callable attribute is being processed. It gives is_simple_callable() 7.5x performance increase in this case without noticeable slowdown of other execution paths because callable() check is very fast.

The same way, a simple hasattr check in get_attribute() makes it possible to avoid a heavy isinstance check in case when object instance is not a dictionary. This results in 1.6x execution time improvement for non-dictionaries and 10-15% slowdown for dictionaries.

Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay - my main comment here is that we ought to keep the pull request as absolutely minimal as possible.

@@ -63,22 +63,25 @@ def is_simple_callable(obj):
"""
True if the object is a callable that takes no arguments.
"""
if not callable(obj):
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be up for considering this pull request if it only included this change, as well as an example of the cases that it makes an improvement for.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, checking parameters dict for non-emptiness is rather an over-optimization at the cost of cleanliness. It makes the function perform slightly better on simple callables, but I agree that it was a bad idea to include this in a pull request.

As for swapping the order of isfunction() and ismethod() check, I believe it's harmless, and its effect is noticeable, as it's more common for a callable object attribute to be a method rather than a standalone function. But it's more about code logic than a real performance gain.

Anyway, I' ll split the two changes in is_simple_callable() and one change in get_attribute() into three pull requests for you to consider them separately.

I believe it's more common for serializer field source to be an attribute, not a callable or dictionary item, so early return from these functions will improve performance in most real-world use cases. My personal use case is a geospatial application which serves a lot of non-paginated data. Also, I use fast custom orjson based rendering, and it makes serializer performance improvements even more noticeable.

Below, I added some benchmark-like code to play with if someone is interested. It's not a real request processing benchmark but I included serializer creation and JSON serialization into the benchmark function to make it more realistic and still keep the code in a single Python file. The more attribute based fields serializer has, the more noticeable performance improvement you get.

Benchmark code
import functools
import inspect
import json
from collections.abc import Mapping
from timeit import timeit

from rest_framework import fields, serializers
from rest_framework.fields import BuiltinSignatureError


def is_simple_callable(obj):
    """
    True if the object is a callable that takes no arguments.
    """
    if not callable(obj):
        return False

    # Bail early since we cannot inspect built-in function signatures.
    if inspect.isbuiltin(obj):
        raise BuiltinSignatureError(
            'Built-in function signatures are not inspectable. '
            'Wrap the function call in a simple, pure Python function.')

    if not (inspect.isfunction(obj) or inspect.ismethod(obj) or isinstance(obj, functools.partial)):
        return False

    sig = inspect.signature(obj)
    params = sig.parameters.values()
    return all(
        param.kind == param.VAR_POSITIONAL or
        param.kind == param.VAR_KEYWORD or
        param.default != param.empty
        for param in params
    )


def get_attribute(instance, attrs):
    """
    Similar to Python's built in `getattr(instance, attr)`,
    but takes a list of nested attributes, instead of a single attribute.

    Also accepts either attribute lookup on objects or dictionary lookups.
    """
    for attr in attrs:
        try:
            if hasattr(instance, 'items') and isinstance(instance, Mapping):
                instance = instance[attr]
            else:
                instance = getattr(instance, attr)
        except ObjectDoesNotExist:
            return None
        if is_simple_callable(instance):
            try:
                instance = instance()
            except (AttributeError, KeyError) as exc:
                # If we raised an Attribute or KeyError here it'd get treated
                # as an omitted field in `Field.get_attribute()`. Instead we
                # raise a ValueError to ensure the exception is not masked.
                raise ValueError('Exception raised in callable attribute "{}"; original exception was: {}'.format(attr, exc))

    return instance


def patch_is_simple_callable():
    fields.is_simple_callable = is_simple_callable


def patch_get_attribute():
    fields.get_attribute = get_attribute


class SampleModel:
    def __init__(self, attributes=1):
        for i in range(attributes):
            setattr(self, f's{i}', f'Value {i}')

    def f(self):
        return f'{id(self)}'


class SampleSerializer(serializers.Serializer):
    def __init__(self, *args, attributes=1, callables=1, **kwargs):
        super().__init__(*args, **kwargs)
        for i in range(attributes):
            self.fields[f's{i}'] = serializers.CharField()
        for i in range(callables):
            self.fields[f'f{i}'] = serializers.ReadOnlyField(source='f')


def benchmark_code():
    ATTRIBUTE_FIELDS = 1
    CALLABLE_FIELDS = 1
    OBJECTS=100
    
    objs = [SampleModel(attributes=ATTRIBUTE_FIELDS) for i in range(OBJECTS)]
    serializer = SampleSerializer(
        objs, attributes=ATTRIBUTE_FIELDS, callables=CALLABLE_FIELDS, many=True
    )
    return json.dumps(serializer.data)


if __name__ == '__main__':
    NUMBER = 1000

    print('Original code')
    print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))

    print('Patched is_simple_callable()')
    patch_is_simple_callable()
    print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))

    print('Patched is_simple_callable() and get_attribute()')
    patch_get_attribute()
    print(timeit('f()', number=NUMBER, globals=dict(f=benchmark_code)))

# Bail early since we cannot inspect built-in function signatures.
if inspect.isbuiltin(obj):
raise BuiltinSignatureError(
'Built-in function signatures are not inspectable. '
'Wrap the function call in a simple, pure Python function.')

if not (inspect.isfunction(obj) or inspect.ismethod(obj) or isinstance(obj, functools.partial)):
if not (inspect.ismethod(obj) or inspect.isfunction(obj) or isinstance(obj, functools.partial)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest that we don't include any other modifications within the function. Or that if we do want to consider other changes, then we do so as a separate issue.

@stale
Copy link

stale bot commented Apr 16, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@tomchristie
Copy link
Member

Superseeded by #8502

@tomchristie tomchristie closed this Jun 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants