Replace SimpleThrottle self.history default value with deque #8074
Conversation
rest_framework/throttling.py
Outdated
@@ -120,7 +121,7 @@ def allow_request(self, request, view):
         if self.key is None:
             return True

-        self.history = self.cache.get(self.key, [])
+        self.history = self.cache.get(self.key, deque())
Will constructing a deque() only to be thrown away when hitting the cache be a problem?
Hi @akx,

> Will constructing a deque() only to be thrown away when hitting the cache be a problem?

No, the initialization cost is the same as with the empty-list version. As you can see, there is no other place where self.history is initialized except line 124. All we change is how the FIFO history (self.history) is maintained, by using collections.deque, which is the data structure best suited to this scenario.

Since this changes the data structure used by SimpleRateThrottle, which is probably the most common throttle class, we must call it out in the changelog to avoid upgrade problems.
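For readers following along, here is a minimal sketch of the pattern being described, with assumed names (history, duration, record_request) rather than the actual SimpleRateThrottle code: the newest timestamp goes on the left, expired timestamps fall off the right, so the history behaves as a FIFO queue.

```python
import time
from collections import deque

history = deque()  # newest timestamps at the left, oldest at the right
duration = 60      # illustrative window size in seconds

def record_request(now=None):
    now = time.time() if now is None else now
    # Drop timestamps that fell outside the window; the oldest sit at the right.
    while history and history[-1] <= now - duration:
        history.pop()
    # O(1) with deque.appendleft(); list.insert(0, now) would be O(N).
    history.appendleft(now)
```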
My point was mostly: is it, e.g., slower to initialize a deque than an empty list? Should this explicitly be

-        self.history = self.cache.get(self.key, deque())
+        self.history = self.cache.get(self.key)
+        if self.history is None:
+            self.history = deque()

?
I just checked cpython's listobject.c and Modules/_collectionsmodule.c, as well as django-redis's cache.py, since django/core/cache/backends/base.py leaves .get() for each backend to implement itself.

The initialization performance of list and deque should be close, I think, but it seems we would construct a deque() on every allow_request() call whether or not the key exists in the cache, so I think an explicit approach like yours is better than the original one-liner.
A simple benchmark:
In [1]: timeit.timeit("[]", number=10000)
Out[1]: 0.00045701999999891996
In [2]: timeit.timeit("deque()", setup="from collections import deque", number=10000)
Out[2]: 0.000674222999975882
@@ -120,7 +121,10 @@ def allow_request(self, request, view):
         if self.key is None:
             return True

-        self.history = self.cache.get(self.key, [])
+        self.history = self.cache.get(self.key)
Does there need to be some sort of check that the value returned from the cache is really a deque instance? If the cache was populated by a previous version of DRF, self.history will still be a list instance, right?
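One hedged sketch of such a check (a hypothetical helper, not code from this PR): coerce whatever the cache returns into a deque, so that entries written as plain lists by an older DRF version keep working.

```python
from collections import deque

def load_history(cache, key):
    # Hypothetical defensive loader: tolerate a plain list written by an
    # older DRF version as well as a deque written by the new code.
    cached = cache.get(key)
    # deque(iterable) copies lists and deques alike; None means a cache miss.
    return deque(cached) if cached is not None else deque()
```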
@@ -136,7 +140,7 @@ def throttle_success(self):
         Inserts the current request's timestamp along with the key
         into the cache.
         """
-        self.history.insert(0, self.now)
+        self.history.appendleft(self.now)
         self.cache.set(self.key, self.history, self.duration)
Some cache backends for Django (e.g. Redis) support JSON serialization of cache data (instead of pickle). These might convert a deque to a list (or throw a TypeError on insertion).
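For illustration (independent of any particular cache backend), the standard-library JSON encoder does reject a deque, while the equivalent list serializes fine:

```python
import json
from collections import deque

history = deque([1633036800.0, 1633036799.5])

print(json.dumps(list(history)))  # works: '[1633036800.0, 1633036799.5]'

try:
    json.dumps(history)           # deque is not JSON serializable
except TypeError as exc:
    print(exc)                    # "Object of type deque is not JSON serializable"
```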
I'm pretty sure we've seen a similar pull request that we merged and then had to revert. Ah... found it: https://github.com/encode/django-rest-framework/blob/f83620dcc9e87f81ddc846c56f2ad87c2e548f8d/docs/community/release-notes.md#3124
The most relevant improvement here would be anything we can do to help guide contributors away from code-changing pull requests at this point, unless absolutely necessary.
Yes, it seems that even tiny changes can hurt someone unexpectedly; I totally agree with your point. Thank you for your time!
Description

To decrease the time complexity from O(N) to O(1) when adding an element at the first position of the list (self.history), we should use collections.deque with .appendleft(), which is roughly 10x faster.

Demo: