bpo-28638: speed up namedtuple creation by avoiding exec #2736

JelleZijlstra · 2017-07-16T22:10:55Z

Creating a namedtuple is relatively slow because it uses exec().
This commit reduces the exec()'ed code, but still uses exec() for
creating the __new__ method. I don't know of a way to avoid using
exec() for __new__ beyond manipulating bytecode directly.

However, avoiding exec() for creating the class itself still yields
a significant speedup. In an unscientific benchmark I ran, creating
1000 namedtuple classes now takes about 0.14 s instead of 0.44 s.

There is one backward compatibility break: namedtuples no longer have
a _source attribute, because we no longer exec() their source. I kept
the verbose=True argument around for compatibility, but it now does
nothing.

cc @sixolet @methane

https://bugs.python.org/issue28638
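
As a rough illustration of the approach described above (not the actual patch), here is a minimal sketch that exec()s only `__new__` and builds the class itself with `type()`. The helper name `sketch_namedtuple` is hypothetical, and the sketch assumes at least one field and skips the field-name validation that the real `namedtuple()` performs.

```python
from operator import itemgetter


def sketch_namedtuple(typename, field_names):
    """Hypothetical helper: build a tuple subclass, exec()ing only __new__."""
    field_names = tuple(field_names)
    arg_list = ', '.join(field_names)

    # Only __new__ needs generated source, because its signature depends on
    # the field names. The trailing comma keeps the 1-field case a tuple.
    namespace = {'_tuple': tuple}
    source = (f'def __new__(_cls, {arg_list}):\n'
              f'    return _tuple.__new__(_cls, ({arg_list},))\n')
    exec(source, namespace)

    # Everything else is ordinary Python assembled into a class namespace.
    class_namespace = {
        '__slots__': (),
        '_fields': field_names,
        '__new__': namespace['__new__'],
    }
    for index, name in enumerate(field_names):
        class_namespace[name] = property(itemgetter(index))
    return type(typename, (tuple,), class_namespace)


Point = sketch_namedtuple('Point', ['x', 'y'])
p = Point(1, y=2)
print(p.x, p.y, tuple(p))
```

The generated source shrinks from a whole class body to a two-line function, which is where the speedup in the description would come from.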

mention-bot · 2017-07-16T22:10:58Z

@JelleZijlstra, thanks for your PR! By analyzing the history of the files in this pull request, we identified @rhettinger, @serhiy-storchaka and @vsajip to be potential reviewers.

ilevkivskyi · 2017-07-17T13:09:49Z

But why can't we keep a _source just for backward compatibility, containing a practical equivalent of the proposed no-exec version? Maybe make it a property to avoid some time overhead of large string formatting for users who don't use _source.

ilevkivskyi · 2017-07-17T13:13:54Z

Lib/collections/__init__.py

+            raise TypeError('Expected %d arguments, got %d' % (num_fields, len(result)))
+        return result
+
+    _make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename)


I think this line should be wrapped (as you did below for _replace).

ilevkivskyi · 2017-07-17T13:19:13Z

Lib/collections/__init__.py

+    @classmethod
+    def _make(cls, iterable, new=tuple.__new__, len=len):
+        result = new(cls, iterable)
+        if len(result) != num_fields:


This will keep the closure alive. Maybe make num_fields a (private) attribute of the class and define this function outside of namedtuple? (TBH, I have no idea how efficient this would be, but intuitively it should be more efficient than a nested function, at least in terms of memory.)

Probably @serhiy-storchaka could say whether this makes sense.

In my POC implementation (see bpo-28638) I used len(cls._fields).

But bpo-28638 was closed by Raymond. I suggest closing this PR.

Going to add cls._num_fields to avoid adding another function call here.

ilevkivskyi · 2017-07-17T13:22:39Z

Lib/collections/__init__.py

+    _make.__func__.__doc__ = 'Make a new {typename} object from a sequence or iterable'.format(typename=typename)
+
+    def _replace(_self, **kwds):
+        result = _self._make(map(kwds.pop, field_names, _self))


Same situation with closure here and in __repr__, maybe use _self._fields instead of field_names.

rhettinger · 2017-07-18T00:54:19Z

This looks pretty good so far. I would like to layer in lazy generation of "_source" and to keep the "verbose" option functional. I think it is possible to do all of this in a way that is transparent to the user, keeps the current API fully intact, and only modestly increases the complexity of the code.

JelleZijlstra · 2017-07-18T02:36:24Z

Thanks! I'm going to first address Ivan's comments related to closures, then work on bringing back _source.

ilevkivskyi

This generally looks good now. Maybe Raymond will have more comments. Also it would be good to see detailed benchmarks (space and time) for this particular implementation vs the original one.

methane · 2017-07-18T09:11:30Z

Lib/collections/__init__.py

-    exec(class_definition, namespace)
-    result = namespace[typename]
-    result._source = class_definition
+    arg_list = repr(tuple(field_names)).replace("'", "")[1:-1]


I feel ', '.join(field_names) is more Pythonic.
Does this line behave differently from it?

The only difference is in the case of 1-element tuple.

Why is the trailing comma desirable in the 1-element tuple case?

Otherwise it's no longer a tuple.
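
For reference, a small sketch comparing the two ways of building `arg_list` discussed above. The function names are made up for illustration; the expressions are the ones quoted in the thread.

```python
def arg_list_via_repr(field_names):
    # Expression from the patch: strip quotes and outer parentheses from
    # the tuple repr.
    return repr(tuple(field_names)).replace("'", "")[1:-1]


def arg_list_via_join(field_names):
    # Suggested alternative.
    return ', '.join(field_names)


for names in (['x'], ['x', 'y']):
    print(names, repr(arg_list_via_repr(names)), repr(arg_list_via_join(names)))
# ['x'] 'x,' 'x'
# ['x', 'y'] 'x, y' 'x, y'

# Why the trailing comma matters once the list is wrapped in parentheses:
x = 1
print(type((x)).__name__, type((x,)).__name__)  # int tuple
```

The two spellings agree except for a single field, where only the repr-based version produces a string that still forms a tuple when substituted into `({arg_list})`.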

methane · 2017-07-18T09:51:06Z

Quick and dirty bench on pypy3.5-5.8

# x.py
from collections import namedtuple
nts = [namedtuple(f"Foo{i}", "foo,bar,baz") for i in range(10000)]

original:

$ time pyenv/versions/pypy3.5-5.8.0/bin/pypy x.py
real	0m7.540s
user	0m7.448s
sys	0m0.092s

$ time pyenv/versions/pypy3.5-5.8.0/bin/pypy x.py
real	0m7.529s
user	0m7.448s
sys	0m0.076s

patched:

$ time pyenv/versions/pypy3.5-5.8.0/bin/pypy x.py
real	0m2.010s
user	0m1.988s
sys	0m0.020s

$ time pyenv/versions/pypy3.5-5.8.0/bin/pypy x.py
real	0m2.016s
user	0m1.980s
sys	0m0.032s

methane · 2017-07-18T09:53:10Z

Same bench on Python 3.6.2 on macOS

original:

$ time ./pyenv/versions/3.6.2/bin/python3 x.py
real	0m4.469s
user	0m4.349s
sys	0m0.105s

$ time ./pyenv/versions/3.6.2/bin/python3 x.py
real	0m4.510s
user	0m4.374s
sys	0m0.116s

patched:

$ time ./pyenv/versions/3.6.2/bin/python3 x.py
real	0m1.131s
user	0m1.063s
sys	0m0.059s

$ time ./pyenv/versions/3.6.2/bin/python3 x.py
real	0m1.123s
user	0m1.055s
sys	0m0.060s

methane · 2017-07-18T09:56:01Z

And memory usage on CPython 3.6.2

original:
270 arenas * 262144 bytes/arena = 70,778,880

patched:
144 arenas * 262144 bytes/arena = 37,748,736
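
The numbers above come from whole-process runs. As a rough in-process alternative, the following sketch times class creation with `time.perf_counter` and reports allocations with `tracemalloc`. This is an assumption about how one might reproduce the comparison, not how the figures above were collected, and tracing itself slows the timing down.

```python
import time
import tracemalloc
from collections import namedtuple


def create_classes(count):
    """Create `count` distinct namedtuple classes, as in x.py above."""
    return [namedtuple(f"Foo{i}", "foo,bar,baz") for i in range(count)]


tracemalloc.start()
start = time.perf_counter()
classes = create_classes(1000)
elapsed = time.perf_counter() - start
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"{len(classes)} classes in {elapsed:.3f} s, "
      f"{current / 1e6:.1f} MB currently traced, {peak / 1e6:.1f} MB peak")
```

Running it against the original and the patched `collections` module should give a comparison in the same spirit, though the absolute values will differ from the arena counts reported here.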

pitrou · 2017-07-18T10:14:22Z

Doc/library/collections.rst


    .. versionadded:: 3.3

+    .. versionchanged:: 3.7
+       ``_source`` is no longer used to created the named tuple class.


I find this sentence a bit confusing to understand. Something like "_source is no longer the actual named tuple class implementation, but an equivalent implementation" would be clearer to me.

Changing this to ``_source`` is no longer used to create the named tuple class implementation, but contains an equivalent implementation.

serhiy-storchaka · 2017-07-18T10:28:57Z

Lib/collections/__init__.py

-    exec(class_definition, namespace)
-    result = namespace[typename]
-    result._source = class_definition
+    arg_list = repr(tuple(field_names)).replace("'", "")[1:-1]


The only difference is in the case of 1-element tuple.

serhiy-storchaka · 2017-07-18T10:30:11Z

Lib/collections/__init__.py

+    @classmethod
+    def _make(cls, iterable, new=tuple.__new__, len=len):
+        result = new(cls, iterable)
+        if len(result) != cls._num_fields:


Why not use just len(cls._fields)?

I didn't want to introduce another function call. That might be premature optimization though.

You may be able to add num_fields=num_fields argument like len=len.

I'll go with @methane's suggestion.
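
A minimal sketch of the default-argument suggestion: `num_fields` is bound as a parameter default when the function is defined, the same way `len=len` already is, so the lookup is a fast local and no closure cell is needed. The `build_make` factory and the `Triple` class are hypothetical.

```python
def build_make(num_fields):
    """Hypothetical factory returning a _make bound to a fixed field count."""
    def _make(cls, iterable, new=tuple.__new__, len=len, num_fields=num_fields):
        result = new(cls, iterable)
        if len(result) != num_fields:
            raise TypeError('Expected %d arguments, got %d'
                            % (num_fields, len(result)))
        return result
    return _make


class Triple(tuple):
    __slots__ = ()
    _make = classmethod(build_make(3))


print(Triple._make('abc'))
print(Triple._make.__func__.__closure__)  # None: the value lives in the defaults
```

One trade-off is that the extra defaults show up in the function signature, so a caller could in principle override them.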

serhiy-storchaka · 2017-07-18T10:34:35Z

Lib/collections/__init__.py

+_repr_template = '{name}=%r'
+_new_template = '''
+def __new__(_cls, {arg_list}):
+    'Create new instance of {typename}({arg_list})'


It may be faster to set the __doc__ attribute explicitly rather than compile it.

serhiy-storchaka · 2017-07-18T10:44:08Z

Lib/collections/__init__.py

+    namespace = {'_tuple': tuple}
+    new_source = _new_template.format(typename=typename, arg_list=arg_list)
+    exec(new_source, namespace)
+    __new__ = namespace['__new__']


Set __qualname__ and __module__ attributes of the __new__ method and other methods.
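
A short sketch of what this comment and the previous one ask for: the exec()'ed function carries no docstring in its source, and its metadata is assigned afterwards. The helper `make_new`, the fixed `x, y` signature, and the module name `'geometry'` are placeholders for illustration.

```python
def make_new(typename, module):
    """Hypothetical helper: exec a bare __new__, then attach its metadata."""
    namespace = {'_tuple_new': tuple.__new__}
    exec('def __new__(_cls, x, y): return _tuple_new(_cls, (x, y))', namespace)
    new_func = namespace['__new__']

    # Assigned after the fact instead of being compiled into the source.
    new_func.__doc__ = f'Create new instance of {typename}(x, y)'
    new_func.__qualname__ = f'{typename}.__new__'
    new_func.__module__ = module
    return new_func


point_new = make_new('Point', 'geometry')
print(point_new.__qualname__, point_new.__module__)
print(point_new.__doc__)
```

Without the explicit assignments, the function would keep the plain qualified name `__new__`, which makes tracebacks and `help()` output less informative.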

pitrou · 2017-07-18T11:28:49Z

General question: do we want to use f-strings here rather than explicit format() calls?

ilevkivskyi · 2017-07-18T11:36:12Z

General question: do we want to use f-strings here rather than explicit format() calls?

f-strings should be faster, so we should probably use them where possible.

JelleZijlstra · 2017-07-19T03:55:03Z

A downside of f-strings would be that they would make it harder to port this implementation to e.g. pypy, which doesn't support 3.6 yet. I can check how much of a difference it makes in benchmarks.

methane · 2017-07-19T04:14:32Z

PyPy3.5 supports f-strings.
But I don't know about other implementations (e.g. MicroPython supports Python 3.5).

mlouielu · 2017-07-19T04:23:00Z

If we care about string formatting performance, why not use '%'-formatting? It seems faster than the other two.

methane · 2017-07-19T04:34:11Z

why not use '%'-format?

Because the original code uses the "...{typename}...".format(typename=typename) style.
Changing it to an f-string (f"...{typename}...") is very easy and straightforward.

It seems faster than two others.

Why do you think so? f-strings seem to be the fastest in most cases.

$ python3 -m perf timeit -s 'bar="bar"' -- '"foo{bar}baz".format(bar=bar)'
.....................
Mean +- std dev: 524 ns +- 14 ns

$ python3 -m perf timeit -s 'bar="bar"' -- 'f"foo{bar}baz"'
.....................
Mean +- std dev: 101 ns +- 3 ns

$ python3 -m perf timeit -s 'bar="bar"' -- '"foo%sbaz" % (bar,)'
.....................
Mean +- std dev: 237 ns +- 8 ns
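
The same three statements can also be compared from within Python using the standard `timeit` module instead of `perf`. This is only a sketch: absolute numbers will differ from the ones above and vary by machine and interpreter version.

```python
import timeit

bar = "bar"

# The three formatting styles measured above, as source strings for timeit.
candidates = {
    'str.format': '"foo{bar}baz".format(bar=bar)',
    'f-string': 'f"foo{bar}baz"',
    '%-format': '"foo%sbaz" % (bar,)',
}

timings = {}
for label, stmt in candidates.items():
    timings[label] = timeit.timeit(stmt, globals={'bar': bar}, number=100_000)

for label, seconds in timings.items():
    # Convert total seconds for 100,000 runs into nanoseconds per run.
    print(f'{label:>10}: {seconds / 100_000 * 1e9:.0f} ns per call')
```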

mlouielu · 2017-07-19T05:18:27Z

@methane I measured this wrong: I was testing formatting 10 items in the string, not just one.

Notes:

- fasternt.py is a copy of collections/__init__.py as of python/cpython#2736
- you need to install perf==0.9.6 to run the benchmark, the latest version doesn't work
- run the benchmark with ./bench in the prof directory

In summary, fasternt is ~4x faster than the stdlib at creating namedtuple classes and has no effect on instantiation or attribute access. cnamedtuple is 40x faster at class creation and ~30% faster at instantiation and attribute access.

methane · 2017-07-19T09:15:48Z

Lib/collections/__init__.py

+_new_template = '''
+def __new__(_cls, {arg_list}):
+    'Create new instance of {typename}({arg_list})'
+    return _tuple.__new__(_cls, ({arg_list}))


Instead of _tuple.__new__, how about putting namespace['_tuple_new'] = tuple.__new__ into the namespace and calling _tuple_new?
It makes eval and instantiation a bit (about 3%) faster.

this pull request:

$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 90.1 us +- 3.3 us

$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 503 ns +- 15 ns

_tuple_new version:

$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple' -- 'namedtuple("Foo", "foo bar baz")'
.....................
Mean +- std dev: 87.1 us +- 2.7 us

$ ./pyenv/versions/3.6.2/bin/python -m perf timeit -s 'from collections import namedtuple; Foo=namedtuple("Foo", "foo bar baz")' -- 'Foo(1,2,3)'
.....................
Mean +- std dev: 485 ns +- 21 ns

Thanks, making that change.
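
A minimal sketch of the two variants being compared. In the first, the generated function looks up `_tuple` and then its `__new__` attribute on every call; in the second, `tuple.__new__` is placed in the exec namespace once under the name `_tuple_new`. The helper `build_new` and the fixed `x, y` signature are illustrative only.

```python
def build_new(call_expr, namespace):
    """Hypothetical helper: exec a __new__ that calls `call_expr`."""
    source = f'def __new__(_cls, x, y): return {call_expr}(_cls, (x, y))'
    exec(source, namespace)
    return namespace['__new__']


# Attribute lookup on every call: global _tuple, then .__new__
new_via_attr = build_new('_tuple.__new__', {'_tuple': tuple})

# Pre-bound: a single global lookup of _tuple_new per call
new_prebound = build_new('_tuple_new', {'_tuple_new': tuple.__new__})


class Point(tuple):
    __slots__ = ()


print(new_via_attr(Point, 1, 2) == new_prebound(Point, 1, 2))
```

Both functions build the same object; the second simply skips one attribute lookup per instantiation, which is consistent with the small difference measured above.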

methane · 2017-08-27T15:39:20Z

@JelleZijlstra would you update this pull request?

JelleZijlstra · 2017-08-27T15:47:31Z

Sorry, I've been busy. I'll try to find some time for it today.

JelleZijlstra · 2017-08-27T21:05:37Z

I think I've addressed all comments since my last push. Let me know if you'd like to see any other changes.

Another open question: Should we use f-strings? They'll likely be a little faster, but may make it harder to port this code to other Python versions and implementations.

methane · 2017-08-28T04:59:26Z

Another open question: Should we use f-strings? They'll likely be a little faster, but may make it harder to port this code to other Python versions and implementations.

This patch will not be backported, not even to Python 3.6, which supports f-strings.
There is no chance of backporting this to Python 3.5.
Additionally, PyPy3.5 supports f-string.

So please don't worry about it.

ilevkivskyi · 2017-08-28T07:20:30Z

Yes, I also think f-strings should be used here.

rhettinger · 2017-08-28T09:36:27Z

Per past policy, optimizations don't get backported. I'll look at this patch more during the sprints next week.

eric-wieser · 2017-09-02T04:02:52Z

Lib/collections/__init__.py

+    __new__ = namespace['__new__']
+    __new__.__doc__ = f'Create new instance of {typename}({arg_list})'
+
+    @classmethod


Might be cleaner to move this down to the class_namespace, to avoid having to use .__func__

eric-wieser · 2017-09-02T04:11:58Z

Lib/collections/__init__.py

+
+    def __getnewargs__(self):
+        'Return self as a plain tuple.  Used by copy and pickle.'
+        return tuple(self)


These three functions without formatted docstrings don't need to be redeclared for each namedtuple, do they?

I think that doesn't work now because I'm setting __qualname__ and __module__ on them.
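
To illustrate the conflict mentioned in the reply: if one function object is shared between classes, per-class metadata such as `__qualname__` cannot differ, because there is only one object to set it on. The names `_shared_getnewargs`, `A`, and `B` are made up for this sketch.

```python
def _shared_getnewargs(self):
    'Return self as a plain tuple.  Used by copy and pickle.'
    return tuple(self)


class A(tuple):
    __slots__ = ()
    __getnewargs__ = _shared_getnewargs


class B(tuple):
    __slots__ = ()
    __getnewargs__ = _shared_getnewargs


# Setting the qualified name "for A" also changes what B reports,
# because both classes reference the same function object.
A.__getnewargs__.__qualname__ = 'A.__getnewargs__'
print(B.__getnewargs__.__qualname__)  # A.__getnewargs__
print(A.__getnewargs__ is B.__getnewargs__)  # True
```

So sharing these methods would save memory per class, but only if the per-class `__qualname__` and `__module__` values are given up.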

eric-wieser · 2017-09-02T05:52:21Z

Lib/collections/__init__.py

+_repr_template = '{name}=%r'
+_new_template = '''
+def __new__(_cls, {arg_list}):
+    return _tuple_new(_cls, ({arg_list}))


Wouldn't that be better handled by inserting the trailing comma here?

eric-wieser · 2017-09-02T05:52:51Z

Lib/collections/__init__.py

+    new_source = _new_template.format(arg_list=arg_list)
+    exec(new_source, namespace)
+    __new__ = namespace['__new__']
+    __new__.__doc__ = f'Create new instance of {typename}({arg_list})'


Otherwise you get a strange trailing comma in this docstring

This pull request is about performance. I don't think it should change more than it needs to.
The current pull request follows the current code:

>>> foo = collections.namedtuple("foo", "a")
>>> print(foo._source)
...
    def __new__(_cls, a,):
        'Create new instance of foo(a,)'
        return _tuple.__new__(_cls, (a,))
...

Although the current pull request doesn't produce exactly the same code any more anyway, since _tuple.__new__ is now _tuple_new, so there's no particular argument against changing it other than "it doesn't need to be part of this pr" (which I agree is true)

emmatyping · 2017-09-03T17:10:16Z

Misc/NEWS.d/next/Library/2017-08-27-14-03-50.bpo-28638.W4CQxG.rst

@@ -0,0 +1,2 @@
+Speed up namedtuple class creation by 4x by avoiding usage of exec(). Patch


This could be read as saying that exec was eliminated entirely, so perhaps '...by reducing usage of exec().'?

rhettinger · 2017-09-08T21:03:07Z

After further discussion with Guido, I've decided to eliminate the _source attribute and verbose parameter.
I've written my own patch from scratch and have mostly sync'd-up with this patch. Let's please move the discussion to bpo-28638: Optimize namedtuple() creation time by minimizing use of exec() #3454

The main differences are:

Dropped _source and verbose
Better use of closure variables as a fast way to pass fixed values into the methods
Use dict() notation versus {k:v} style
Early conversion of field_names to a tuple so this gets done just once
Re-use itemgetter objects and intern the docstrings
Better comments of sections and note where exec interns for us
Compact single-line template for __new__
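
As a hedged illustration of one item in the list above, re-using itemgetter objects and interning docstrings could look roughly like the sketch below. This is not taken from #3454; the cache `_itemgetter_cache` and the helper `field_property` are hypothetical names.

```python
import sys
from operator import itemgetter

# Hypothetical cache: one itemgetter per field index, shared by all classes.
_itemgetter_cache = {}


def field_property(index):
    """Return a read-only property for the field at `index`."""
    getter = _itemgetter_cache.get(index)
    if getter is None:
        getter = _itemgetter_cache[index] = itemgetter(index)
    # Interning lets identical docstrings share a single string object.
    doc = sys.intern(f'Alias for field number {index}')
    return property(getter, doc=doc)


first = field_property(0)
second = field_property(0)
print(first.fget is second.fget)        # same itemgetter object re-used
print(first.__doc__ is second.__doc__)  # same interned docstring
```

With many namedtuple classes that have the same number of fields, sharing these objects would avoid allocating a new getter and docstring per field per class.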

the-knights-who-say-ni added the CLA signed label Jul 16, 2017

pitrou added the performance Performance or resource usage label Jul 17, 2017

ilevkivskyi reviewed Jul 17, 2017

rhettinger self-assigned this Jul 18, 2017

JelleZijlstra added 3 commits July 17, 2017 19:44

avoid using closure variables and break up long lines

c135a36

bring back _source

77be15f

update _source documentation

649bb2e

ilevkivskyi approved these changes Jul 18, 2017

methane reviewed Jul 18, 2017

pitrou reviewed Jul 18, 2017

serhiy-storchaka reviewed Jul 18, 2017

methane reviewed Jul 19, 2017

micro-optimizations

da03fdb

bedevere-bot added the awaiting core review label Aug 27, 2017

JelleZijlstra added 4 commits August 27, 2017 13:53

reword docs

805e0cd

create __new__ docstring separately

1f13a26

set __module__ and __qualname__

af0c6bf

add NEWS entry

c28d4e4

lots of f-strings

d73ae9d

eric-wieser reviewed Sep 2, 2017

emmatyping reviewed Sep 3, 2017

rhettinger closed this Sep 8, 2017

rhettinger mentioned this pull request Sep 8, 2017

bpo-28638: Optimize namedtuple() creation time by minimizing use of exec() #3454

Merged

Mariatta removed the awaiting core review label Oct 8, 2017
