bpo-28638: Optimize namedtuple() creation time by minimizing use of exec() #3454

rhettinger · 2017-09-08T07:16:31Z

Current version passes tests but does not support the _source attribute or the verbose option.

Obtains a 5x creation-time speed-up, with small positive impacts on post-creation performance.

=============== Baseline ===============
Cumulative time over 10,000 iterations:
0.044 arg_checking
0.238 templating
2.907 exec
3.257 all
Total with call overhead: 3.270 seconds

=============== This version ===============
Cumulative time over 10,000 iterations:
0.073 arg_checking
0.306 template_and_exec
0.207 non_exec
0.626 all
Total with call overhead: 0.639 seconds
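A rough way to reproduce this kind of measurement is sketched below. It is not the harness used for the figures above: the helper name create_class and the iteration count are assumptions chosen for illustration, and it only reports the total per-class cost rather than the per-phase breakdown.

```python
import timeit
from collections import namedtuple


def create_class():
    # Build a fresh three-field named tuple class on every call.
    return namedtuple('Point', ['x', 'y', 'z'])


# Fewer iterations than the 10,000 used above, to keep the run short.
iterations = 1_000
total_seconds = timeit.timeit(create_class, number=iterations)
per_class_us = total_seconds / iterations * 1e6
print(f'{per_class_us:.1f} microseconds per class creation')
```

Absolute numbers depend on the interpreter version and the machine, so only the ratio between two builds measured the same way is meaningful.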

Ideas going forward

Cache the itemgetter instances
Generate _source dynamically
Need to compare to Jelle's patch

https://bugs.python.org/issue28638

methane · 2017-09-08T11:32:27Z

Lib/collections/__init__.py

+    )
+    for index, name in enumerate(field_names):
+        class_namespace[name] = property(fget = reuse_itemgetter(index),
+                                         doc = f'Alias for field number {index}')


Nice hack! How about reusing docstring too?

Yes, that is a great idea. Adding: doc = _sys.intern(f'Alias for field number {index}')
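A minimal sketch of what interning the docstring buys: every call to the f-string builds a new str object, and sys.intern() maps equal text back to one shared object. The helper name field_doc is hypothetical and not part of the patch.

```python
import sys


def field_doc(index):
    # The f-string creates a new str each time; sys.intern() returns the
    # single canonical object for that text.
    return sys.intern(f'Alias for field number {index}')


doc_a = field_doc(0)
doc_b = field_doc(0)
print(doc_a is doc_b)   # both names refer to the same interned object
```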

rhettinger · 2017-09-08T21:06:31Z

Incorporating by reference all the comments in the closed PR: #2736

The two patches are mostly synced. The principal differences are:

Dropped _source and verbose
Better use of closure variables as a fast way to pass fixed values into the methods
Use dict() notation versus {k:v} style
Early conversion of field_names to a tuple so this gets done just once
Re-use itemgetter objects and intern the docstrings
Better comments of sections and note where exec interns for us
Compact single-line template for __new__
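The last item, together with passing fixed values in through a namespace, can be sketched as follows. This is an illustration under assumptions rather than the code in the patch: make_new is a hypothetical helper, and the class is assembled with a bare type() call instead of the full namedtuple machinery.

```python
def make_new(field_names):
    field_names = tuple(field_names)
    # repr(('x', 'y')) is "('x', 'y')"; dropping the quotes and the outer
    # parentheses leaves "x, y" (and "x," for a single field, which keeps
    # the inner expression a tuple).
    arg_list = repr(field_names).replace("'", "")[1:-1]
    source = f'def __new__(_cls, {arg_list}): return _tuple_new(_cls, ({arg_list}))'
    # tuple.__new__ is handed to the generated code through its namespace,
    # so the function body does not have to look it up via an attribute.
    namespace = {'_tuple_new': tuple.__new__}
    exec(source, namespace)
    return namespace['__new__']


Point = type('Point', (tuple,), {'__slots__': (), '__new__': make_new(['x', 'y'])})
p = Point(1, y=2)
print(p == (1, 2))
```

Only __new__ needs exec() here, because its signature depends on the field names; methods with a fixed signature can be written with ordinary def statements.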

serhiy-storchaka

Since this is a compatibility-breaking change, it should be documented in the corresponding section of What's New.

serhiy-storchaka · 2017-09-08T21:58:23Z

Lib/collections/__init__.py

-        field_defs = '\n'.join(_field_template.format(index=index, name=name)
-                               for index, name in enumerate(field_names))
+    # Variables used in the methods and docstrings
+    field_names = tuple(field_names)


Intern names. Comparing identical strings is faster. In the current implementation the compiler does this for us.
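A small sketch of the point about names: strings produced at runtime (for example by splitting a field specification) are equal to, but not necessarily the same object as, the names the compiler interns. The field names used here are made up for illustration.

```python
import sys

# Field names arriving as one string are split at runtime, which creates
# new str objects that the compiler never saw.
field_names = 'velocity, heading'.replace(',', ' ').split()

# Interning maps each name to the canonical object, so later comparisons
# (for example dict lookups on attribute names) can succeed on identity.
interned = tuple(map(sys.intern, field_names))

print(interned[0] is sys.intern('velocity'))
```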

emmatyping · 2017-09-08T23:02:06Z

Lib/collections/__init__.py

+    # Create all the named tuple methods to be added to the class namespace
+
+    s = f'def __new__(_cls, {arg_list}): return _tuple_new(_cls, ({arg_list}))'
+    namespace = dict(_tuple_new=tuple_new, __name__=module_name)


Could you explain why you chose dict() over {k:v}? I believe {k:v} is faster because it doesn't need to make a function call.

Beauty counts.

Fair enough. Now that I think about it, I wonder if optimizing it in the peephole optimizer would be a good thing? The pattern is pretty consistent: LOAD_GLOBAL (dict) (LOAD_CONST arg names and values) CALL_FUNCTION_KW.

Can't do it because builtins.dict can be replaced by a user or shadowed in globals(). Just the facts of life in a very dynamic language.

Of course. Thanks for the explanation! I should have guessed there was a good reason for a "simple improvement" such as that not being implemented.
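The shadowing argument can be demonstrated with a short sketch. LoudDict and make_namespace are hypothetical names used only for this illustration; the example rebinds dict in the module globals and then restores it.

```python
def make_namespace():
    # The name `dict` is resolved at call time: module globals first,
    # then builtins. The compiler cannot know which object it will find.
    return dict(index=0, name='x')


before = make_namespace()


class LoudDict(dict):
    """Hypothetical replacement, used only to show that shadowing works."""


# Shadow the builtin in this module's globals, call again, then restore.
globals()['dict'] = LoudDict
after = make_namespace()
del globals()['dict']

print(type(before).__name__, type(after).__name__)
```

The same source line produced two different types, which is why a compile-time rewrite of dict(k=v) into a {k: v} display would change behavior.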

methane · 2017-09-09T00:01:06Z

Lib/collections/__init__.py

+            return _nt_itemgetters[index]
+        except KeyError:
+            getter = _nt_itemgetters[index] = _itemgetter(index)
+            return getter


How about caching property?

    doc = f'Alias for field number {index}'
    prop = _nt_props[index] = property(_itemgetter(index), doc=doc)
    return prop

Can't cache property(): the docstring is mutable and can be updated by the user, so the property objects need to be distinct.

Oh, I didn't realize that.
Anyway, I prefer caching the docstring over sys.intern, because the sys.intern approach
requires creating a temporary new string and hashing it.
But that's just my preference, and there is no real problem.
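A minimal sketch of why the property objects must stay distinct while the getter can be shared. It relies on property docstrings being writable, which the Python documentation states has been the case since 3.5; the variable names are illustrative.

```python
from operator import itemgetter

# The getter holds no per-class state, so one instance can serve many classes.
getter = itemgetter(0)
prop_a = property(getter, doc='Alias for field number 0')
prop_b = property(getter, doc='Alias for field number 0')

# A user customizes the docstring of one class's field.
prop_a.__doc__ = 'x coordinate'

print(prop_b.__doc__)              # unaffected by the change to prop_a
print(prop_a.fget is prop_b.fget)  # the itemgetter itself is shared
```

Had both classes shared one property object, the docstring edit would have leaked from one class to the other.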

JelleZijlstra · 2017-09-09T05:20:04Z

Lib/collections/__init__.py

        if name in seen:
-            raise ValueError('Encountered duplicate field name: %r' % name)
+            raise ValueError('Encountered duplicate field name: {name!r}')


missing an f

Fixed. Thanks for noticing.
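For reference, a short sketch of the difference the missing prefix makes; the field name 'x' is just an example value.

```python
name = 'x'

# Without the f prefix the braces are ordinary characters.
without_prefix = 'Encountered duplicate field name: {name!r}'
# With the prefix, {name!r} is replaced by repr(name).
with_prefix = f'Encountered duplicate field name: {name!r}'

print(without_prefix)   # Encountered duplicate field name: {name!r}
print(with_prefix)      # Encountered duplicate field name: 'x'
```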

serhiy-storchaka · 2017-09-09T06:25:39Z

Lib/collections/__init__.py

+
+    for method in (__new__, _make.__func__, _replace,
+                   __repr__, _asdict, __getnewargs__):
+        method.__module__ = module_name


This should be the same as the __module__ attribute of the class. But the latter is defined at the end of this function, so the code needs reordering.

I'm wondering whether the method.__module__ assignment should be removed entirely. It doesn't seem to be needed for pickling. For introspection purposes, the code is actually defined in collections.__init__ rather than in the caller's namespace.

It is needed for pickling these methods.

This change is not required for this PR since pickling methods is not supported in the current implementation. But I think it is easy to add pickle support while we rewrite this code.

serhiy-storchaka · 2017-09-09T18:57:11Z

Lib/collections/__init__.py

+
+    for method in (__new__, _make.__func__, _replace,
+                   __repr__, _asdict, __getnewargs__):
+        method.__module__ = module_name


It is needed for pickling these methods.

This change is not required for this PR since pickling methods is not supported in the current implementation. But I think it is easy to add pickle support while we rewrite this code.

serhiy-storchaka · 2017-09-09T19:03:22Z

Doc/whatsnew/3.7.rst

+  or ``_source`` attribute which showed the generated source code for the
+  named tuple class.  This was part of an optimization designed to speed-up
+  class creation.  (Contributed by Jelle Zijlstra with further improvements
+  by INADA Naoki, Serhiy Storchaka, and Raymond Hettinger in :issue:`28638`.)


I suppose this should be "Naoki Inada" instead of "INADA Naoki".

Perhaps. That is for him to decide.

While I don't care in general, I prefer "INADA Naoki" here because I use it in git config "user.name".

serhiy-storchaka · 2017-09-09T19:06:20Z

Doc/whatsnew/3.7.rst

@@ -435,6 +435,12 @@ API and Feature Removals
  Python 3.1, and has now been removed.  Use the :func:`~os.path.splitdrive`
  function instead.

+* :func:`collections.namedtuple` no longer supports the *verbose* parameter
+  or ``_source`` attribute which showed the generated source code for the
+  named tuple class.  This was part of an optimization designed to speed-up


I think it is worth adding a note about the optimization in the "Optimizations" section.

I don't want to litter the docs with more notes about this. The more important part to note is the feature removal in whatsnew. The optimization is mostly negligible and insignificant to most users (normal startup doesn't use named tuples at all; a module using one named tuple was using only 0.3ms which was just under 1% of the overall start-up time; formerly, we could make 3 namedtuple classes per millisecond and now we can make about 15). It is one of the least important optimizations we could have done.

serhiy-storchaka

LGTM, except for how Naoki's name is written.

rhettinger · 2017-09-10T01:34:30Z

Breakdown of time to create a named tuple class using the current patch:

9μs for the argument checking and creation of variables used in the methods and docstrings
32μs for creating __new__ using templating and exec()
2μs for making all the other methods using def
2μs for adding __qualname__ to all the methods
20μs for making the class dict, making the properties, and calling type() to build the class
3μs for function call overhead, computing __module__, and timing overhead

The total of 68μs with the patch compares to 327 μs without the patch, giving an approx 5x speed-up for named tuple class creation. There should be some memory savings (_source is no longer stored, the code objects for all methods with __new__ are reused, and the itemgetter objects are reused).

For the actual operation of named tuple instances, there are small (hard to measure) differences in performance:

2% better for __new__() at about 0.4μs per call (with 3 fields)
1% better for _replace() at about 1.2μs per call (with 2 out of 3 fields replaced)
6% worse for _make() at about 0.4μs per call (with 3 fields)
0% change for _asdict() at about 1.2μs per call (with 3 fields)
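Per-call figures like these can be approximated with timeit, as in the sketch below. It is not the script used for the numbers above: the operations table and the call count are assumptions, and wrapping each call in a lambda adds overhead, so the absolute values will read higher than a bare call.

```python
import timeit
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'])
p = Point(1, 2, 3)

# One entry per operation discussed above, each using three fields.
operations = {
    '__new__': lambda: Point(1, 2, 3),
    '_replace': lambda: p._replace(x=10, y=20),   # 2 of 3 fields replaced
    '_make': lambda: Point._make([1, 2, 3]),
    '_asdict': lambda: p._asdict(),
}

number = 20_000
results = {}
for label, func in operations.items():
    seconds = timeit.timeit(func, number=number)
    results[label] = seconds / number * 1e6
    print(f'{label:10s} {results[label]:.2f} microseconds per call')
```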

serhiy-storchaka · 2017-09-10T07:10:39Z

Lib/collections/__init__.py

+            itemgetter_object = cache[index]
+        except KeyError:
+            itemgetter_object = cache[index] = _itemgetter(index)
+        doc = _sys.intern(f'Alias for field number {index}')


Maybe use the same cache for docstrings?

    try:
        itemgetter_object, doc = cache[index]
    except KeyError:
        itemgetter_object, doc = cache[index] = _itemgetter(index), f'Alias for field number {index}'
    class_namespace[name] = property(itemgetter_object, doc=doc)

No need. sys.intern() takes care of the caching for us and does so at C-speed. And though it probably isn't needed, it can share strings across the whole system including hand-rolled named tuples.

Actually using sys.intern() makes namedtuple class creation up to 10% slower due to the cost of string formatting, attribute lookup and function call.

Additionally, adding strings that are not attribute or parameter names to the interned-strings hash table increases the size of that table and the probability of hash collisions.

gvanrossum · 2017-09-10T18:58:40Z

Congrats Raymond and everyone who contributed to this major improvement!

(To whoever merged this PR, next time please clean up the commit description (it's too late for this one, commits pushed to GitHub are forever). Core committers who merge PRs are responsible for producing a readable commit message. The current message is just the concatenation of all local commits, including things like "Neaten-up a bit". Hopefully the news blurb is better.)

methane · 2017-09-11T04:53:48Z

Congrats!
The time for import inspect (which asyncio imports) is reduced from about 19ms to 16ms.

Working draft without _source

baca675

rhettinger added the awaiting changes and performance labels Sep 8, 2017

rhettinger self-assigned this Sep 8, 2017

bedevere-bot removed the awaiting changes label Sep 8, 2017

the-knights-who-say-ni added the CLA signed label Sep 8, 2017

bedevere-bot added the awaiting merge label Sep 8, 2017

rhettinger added 3 commits September 8, 2017 01:06

Re-use itemgetter() instances

3a2aa41

Speed-up calls to __new__() with a pre-bound tuple.__new__()

e24ef4b

Add note regarding string interning

49aa967

methane reviewed Sep 8, 2017

rhettinger added 8 commits September 8, 2017 09:51

Remove unnecessary create function wrappers

c34b444

Minor sync-ups with PR-2736. Mostly formatting and f-strings

0c7f163

Bring-in qualname/__module fix-ups from PR-2736

5de0b7f

Formally remove the verbose flag and _source attribute

ca643f4

Restore a test of potentially problematic field names

e18b92e

Restore kwonly_args test but without the verbose option

0851fc7

Adopt Inada's idea to reuse the docstrings for the itemgetters

3ac1151

Neaten-up a bit

b31f063

rhettinger mentioned this pull request Sep 8, 2017

bpo-28638: speed up namedtuple creation by avoiding exec #2736

Closed

Add news blurb

deb30a2

rhettinger changed the title ~~bpo-28638: Optimize namedtuple() creation time (Work-in-progress)~~ bpo-28638: Optimize namedtuple() creation time by minimizing use of exec() Sep 8, 2017

serhiy-storchaka self-requested a review September 8, 2017 21:52

serhiy-storchaka reviewed Sep 8, 2017

emmatyping reviewed Sep 8, 2017

bedevere-bot added awaiting core review and removed awaiting merge labels Sep 8, 2017

methane reviewed Sep 9, 2017

Merge branch 'master' of github.com:python/cpython into optimize-name…

f402cb3

…dtuple

Serhiy pointed-out the need for interning

86cef9e

JelleZijlstra reviewed Sep 9, 2017

Jelle noticed as missing f on an f-string

83b5e93

serhiy-storchaka reviewed Sep 9, 2017

rhettinger added 2 commits September 9, 2017 10:50

Add whatsnew entry for feature removal

0168317

Accede to request for dict literals instead keyword arguments

dadafc0

serhiy-storchaka reviewed Sep 9, 2017

Leave the method.__module__ attribute pointing the actual location of…

483d08c

… the code

serhiy-storchaka approved these changes Sep 9, 2017

bedevere-bot added awaiting merge and removed awaiting core review labels Sep 9, 2017

rhettinger added 3 commits September 9, 2017 21:57

Improve variable names and add a micro-optimization for an non-public…

bc6852c

… helper function

Simplify by in-lining reuse_itemgetter()

c897674

Arrange steps in more logical order

bd4ea4e

serhiy-storchaka reviewed Sep 10, 2017

Save docstring in local cache instead of interning

ffb78c9

serhiy-storchaka approved these changes Sep 10, 2017

rhettinger merged commit 8b57d73 into python:master Sep 10, 2017

rhettinger deleted the optimize-namedtuple branch September 10, 2017 17:23

Mariatta removed the awaiting merge label Oct 8, 2017

yan12125 mentioned this pull request Jul 10, 2018

[postponed] tox/Travis: test against Python 3.7 davidhalter/jedi#1161

Closed
