-
-
Notifications
You must be signed in to change notification settings - Fork 32.3k
bpo-28638: speed up namedtuple creation by avoiding exec #2736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Creating a namedtuple is relatively slow because it uses exec(). This commit reduces the exec()'ed code, but still uses exec() for creating the __new__ method. I don't know of a way to avoid using exec() for __new__ beyond manipulating bytecode directly. However, avoiding exec() for creating the class itself still yields a significant speedup. In an unscientific benchmark I ran, creating 1000 namedtuple classes now takes about 0.14 s instead of 0.44 s. There is one backward compatibility break: namedtuples no longer have a _source attribute, because we no longer exec() their source. I kept the verbose=True argument around for compatibility, but it now does nothing.
@JelleZijlstra, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rhettinger, @serhiy-storchaka and @vsajip to be potential reviewers. |
But why can't we keep a |
Lib/collections/__init__.py
Outdated
raise TypeError('Expected %d arguments, got %d' % (num_fields, len(result))) | ||
return result | ||
|
||
_make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line should be wrapped (as you did below for _replace
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Lib/collections/__init__.py
Outdated
@classmethod | ||
def _make(cls, iterable, new=tuple.__new__, len=len): | ||
result = new(cls, iterable) | ||
if len(result) != num_fields: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will keep the closure alive, maybe make num_fileds
a (private) attribute of the class and define this function outside of namedtuple
? (TBH however, I have no idea how efficient this will be, but intuitively this should be more efficient than with nested function at least in terms of memory).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably @serhiy-storchaka could say whether this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to add cls._num_fields
to avoid adding another function call here.
Lib/collections/__init__.py
Outdated
_make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename) | ||
|
||
def _replace(_self, **kwds): | ||
result = _self._make(map(kwds.pop, field_names, _self)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same situation with closure here and in __repr__
, maybe use _self._fields
instead of field_names
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This looks pretty good so far. I would like to layer in lazy generation of "_source" and to keep the "verbose" option functional. I think is possible to do this all in a way that it transparent to the user, that keeps the current API fully intact and only modestly increases the complexity of the code |
Thanks! I'm going to first address Ivan's comments related to closures, then work on bringing back |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks good now. Maybe Raymond will have more comments. Also it would be good to see detailed benchmarks (space and time) for this particular implementation vs the original one.
exec(class_definition, namespace) | ||
result = namespace[typename] | ||
result._source = class_definition | ||
arg_list = repr(tuple(field_names)).replace("'", "")[1:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel ', '.join(field_names)
is pythonic code.
Is this line different from it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only difference is in the case of 1-element tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the trailing comma desirable in the 1-element tuple case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise it's no longer a tuple.
Quick and dirty bench on pypy3.5-5.8
original:
patched:
|
Same bench on Python 3.6.2 on macOS original:
patched:
|
And memory usage on CPython 3.6.2 original: patched: |
Doc/library/collections.rst
Outdated
|
||
.. versionadded:: 3.3 | ||
|
||
.. versionchanged:: 3.7 | ||
``_source`` is no longer used to created the named tuple class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this sentence a bit confusing to understand. Something like "_source is no longer the actual named tuple class implementation, but an equivalent implementation" would be clearer to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing this to ``_source`` is no longer used to create the named tuple class implementation, but contains an equivalent implementation.
exec(class_definition, namespace) | ||
result = namespace[typename] | ||
result._source = class_definition | ||
arg_list = repr(tuple(field_names)).replace("'", "")[1:-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only difference is in the case of 1-element tuple.
Lib/collections/__init__.py
Outdated
@classmethod | ||
def _make(cls, iterable, new=tuple.__new__, len=len): | ||
result = new(cls, iterable) | ||
if len(result) != cls._num_fields: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use just len(cls._fields)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to introduce another function call. That might be premature optimization though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may be able to add num_fields=num_fields
argument like len=len
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll go with @methane's suggestion.
Lib/collections/__init__.py
Outdated
_repr_template = '{name}=%r' | ||
_new_template = ''' | ||
def __new__(_cls, {arg_list}): | ||
'Create new instance of {typename}({arg_list})' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be faster to set the __doc__
attribute explicitly rather than compile it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Lib/collections/__init__.py
Outdated
namespace = {'_tuple': tuple} | ||
new_source = _new_template.format(typename=typename, arg_list=arg_list) | ||
exec(new_source, namespace) | ||
__new__ = namespace['__new__'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set __qualname__
and __module__
attributes of the __new__
method and other methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
General question: do we want to use f-strings here rather than explicit format() calls? |
f-strings should be faster, so that probably we should use them where possible. |
A downside of f-strings would be that they would make it harder to port this implementation to e.g. pypy, which doesn't support 3.6 yet. I can check how much of a difference it makes in benchmarks. |
PyPy3.5 supports f-string. |
If we want performance about formatting string, why not use '%'-format? It seems faster than two others. |
Because original code uses
Why do you think so? It seems f-string is fast in most cases.
|
Notes: - fasternt.py is a copy of collections/__init__.py as of python/cpython#2736 - you need to install perf==0.9.6 to run the benchmark, the latest version doesn't work - run the benchmark with ./bench in the prof directory. In summary, fasternt is ~4x faster than the stdlib at creating namedtuple classes and has no effect on instantiation or attribute access. cnamedtuple is 40x faster at class creation and ~30% faster at instantiation and attribute access.
Lib/collections/__init__.py
Outdated
_new_template = ''' | ||
def __new__(_cls, {arg_list}): | ||
'Create new instance of {typename}({arg_list})' | ||
return _tuple.__new__(_cls, ({arg_list})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of _tuple.__new__
, how about put namespace['_tuple_new'] = tuple.__new__
and call _tuple_new
?
It makes eval
and instantiation bit (about 3%) faster.
this pull request:
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 90.1 us +- 3.3 us
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 503 ns +- 15 ns
_tuple_new
version:
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 87.1 us +- 2.7 us
$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 485 ns +- 21 ns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, making that change.
@JelleZijlstra would you update this pull request? |
Sorry, I've been busy. I'll try to find some time for it today. |
I think I've addressed all comments since my last push. Let me know if you'd like to see any other changes. Another open question: Should we use f-strings? They'll likely be a little faster, but may make it harder to port this code to other Python versions and implementations. |
This patch will not be backported even Python 3.6 which supports f-string. So please don't worry about it. |
Yes, I also think f-strings should be used here. |
Per past policy, optimizations don't get backported. I'll look at this patch more during the sprints next week. |
__new__ = namespace['__new__'] | ||
__new__.__doc__ = f'Create new instance of {typename}({arg_list})' | ||
|
||
@classmethod |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be cleaner to move this down to the class_namespace
, to avoid having to use .__func__
|
||
def __getnewargs__(self): | ||
'Return self as a plain tuple. Used by copy and pickle.' | ||
return tuple(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These three functions without formatted docstrings don't need to be redeclared for each namedtuple, do they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that doesn't work now because I'm setting __qualname__
and __module__
on them.
_repr_template = '{name}=%r' | ||
_new_template = ''' | ||
def __new__(_cls, {arg_list}): | ||
return _tuple_new(_cls, ({arg_list})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't that be better handled by inserting the trailing comma here?
new_source = _new_template.format(arg_list=arg_list) | ||
exec(new_source, namespace) | ||
__new__ = namespace['__new__'] | ||
__new__.__doc__ = f'Create new instance of {typename}({arg_list})' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise you get a strange trailing comma in this docstring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request is about performance. I don't think changing much is better.
Current pull request follows current code.
>>> foo = collections.namedtuple("foo", "a")
>>> print(foo._source)
...
def __new__(_cls, a,):
'Create new instance of foo(a,)'
return _tuple.__new__(_cls, (a,))
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although the current pull request doesn't produce exactly the same code any more anyway, since _tuple.__new__
is now _tuple_new
, so there's no particular argument against changing it other than "it doesn't need to be part of this pr" (which I agree is true)
@@ -0,0 +1,2 @@ | |||
Speed up namedtuple class creation by 4x by avoiding usage of exec(). Patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be read as exec was eliminated, so perhaps '...by reducing usage of exec().'?
The main differences are:
|
Creating a namedtuple is relatively slow because it uses exec().
This commit reduces the exec()'ed code, but still uses exec() for
creating the
__new__
method. I don't know of a way to avoid usingexec() for
__new__
beyond manipulating bytecode directly.However, avoiding exec() for creating the class itself still yields
a significant speedup. In an unscientific benchmark I ran, creating
1000 namedtuple classes now takes about 0.14 s instead of 0.44 s.
There is one backward compatibility break: namedtuples no longer have
a _source attribute, because we no longer exec() their source. I kept
the verbose=True argument around for compatibility, but it now does
nothing.
cc @sixolet @methane
https://bugs.python.org/issue28638