bpo-45508: Specialize INPLACE_ADD #29024

sweeneyde · 2021-10-18T09:50:46Z

Pyperformance/Specialization results: https://gist.github.com/sweeneyde/41a76356e875e2a98d16ce5410ab41c0

https://bugs.python.org/issue45508

bedevere-bot · 2021-10-18T10:02:49Z

🤖 New build scheduled with the buildbot fleet by @sweeneyde for commit 62c1d38 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

sweeneyde · 2021-10-18T10:17:39Z

The s390x RHEL7 LTO PR failure is https://bugs.python.org/issue45484

markshannon · 2021-10-18T10:18:05Z

You can remove unicode_concatenate

markshannon · 2021-10-18T13:07:14Z

Any performance numbers or specialization stats for this?

markshannon · 2021-10-18T13:07:52Z

Maybe I should read the first comment 🙂

markshannon · 2021-10-18T13:22:00Z

I'd like to hold off merging this until we have a better way to handle classes like decimal.
I'll look into a fix for BINARY_ADD that we then apply here.

sweeneyde · 2021-10-22T06:46:43Z

I merged with main and ran some microbenchmarks, and it seems decimal += decimal is not that bad.

Microbenchmark program:

from pyperf import Runner

runner = Runner()

runner.timeit("int+=int",
    setup="from itertools import repeat",
    stmt="x = 0\n"
         "for y in repeat(1, 10_000):\n"
         "    x += y; x += y; x += y; x += y; x += y"
    )
runner.timeit("float+=float",
    setup="from itertools import repeat",
    stmt="x = 0.0\n"
         "for y in repeat(1.0, 10_000):\n"
         "    x += y; x += y; x += y; x += y; x += y"
    )
runner.timeit("str+=str",
    setup="from itertools import repeat",
    stmt="for y in repeat('a', 10_000):\n"
         "    x = ''; x += y; x += y; x += y; x += y; x += y"
    )
runner.timeit("list[0]+=str",
    setup="from itertools import repeat",
    stmt="x = [None]\n"
         "for y in repeat('a', 10_000):\n"
         "    x[0] = ''; x[0] += y; x[0] += y; x[0] += y; x[0] += y; x[0] += y"
    )
runner.timeit("float+=int",
    setup="from itertools import repeat",
    stmt="x = 0.0\n"
         "for y in repeat(1, 10_000):\n"
         "    x += y; x += y; x += y; x += y; x += y"
    )
runner.timeit("decimal+=decimal",
    setup="from itertools import repeat; from decimal import Decimal as D",
    stmt="x = D(0)\n"
         "for y in repeat(D(1), 10_000):\n"
         "    x += y; x += y; x += y; x += y; x += y"
    )
runner.timeit("list[0]+=1",
    setup="from itertools import repeat; from collections import defaultdict",
    stmt="dd = [0]\n"
         "for y in repeat(1, 10_000):\n"
         "    dd[0] += y; dd[0] += y; dd[0] += y; dd[0] += y; dd[0] += y",
    )
runner.timeit("defaultdict(int)[0]+=1",
    setup="from itertools import repeat; from collections import defaultdict",
    stmt="dd = defaultdict(int)\n"
         "for y in repeat(1, 10_000):\n"
         "    dd[0] += y; dd[0] += y; dd[0] += y; dd[0] += y; dd[0] += y",
    )

Results from PGO on MSVC:

Benchmark	main_inplace_add_micro	specialized_inplace_add_micro
float+=float	1.18 ms	988 us: 1.19x faster
int+=int	1.34 ms	1.16 ms: 1.15x faster
str+=str	2.08 ms	1.80 ms: 1.15x faster
list[0]+=1	2.47 ms	2.39 ms: 1.03x faster
list[0]+=str	3.62 ms	3.67 ms: 1.01x slower
defaultdict(int)[0]+=1	3.29 ms	3.43 ms: 1.04x slower
float+=int	1.64 ms	1.74 ms: 1.06x slower
Geometric mean	(ref)	1.05x faster

Benchmark hidden because not significant (1): decimal+=decimal

Results from PGO on GCC (WSL):

Benchmark	main_inplace_add_micro2	specialized_inplace_add_micro2
float+=float	885 us	672 us: 1.32x faster
int+=int	1.14 ms	987 us: 1.16x faster
defaultdict(int)[0]+=1	2.61 ms	2.49 ms: 1.05x faster
list[0]+=1	2.03 ms	1.96 ms: 1.03x faster
float+=int	1.36 ms	1.32 ms: 1.03x faster
str+=str	1.39 ms	1.45 ms: 1.04x slower
list[0]+=str	2.75 ms	2.89 ms: 1.05x slower
Geometric mean	(ref)	1.06x faster

Benchmark hidden because not significant (1): decimal+=decimal
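
As a sanity check on the "Geometric mean" rows, the reported ratios can be recombined by hand. This is only a sketch: the ratios are the rounded values from the two tables above, and entering decimal+=decimal as 1.0 (it was hidden as not significant) is an assumption about how the summary was computed, not something the tables state.

```python
from statistics import geometric_mean

# Speed ratios (main / specialized) copied from the tables above.
# "N x slower" entries are written as 1 / N.
# Assumption: the hidden decimal+=decimal benchmark counts as 1.0.
msvc_ratios = [1.19, 1.15, 1.15, 1.03, 1 / 1.01, 1 / 1.04, 1 / 1.06, 1.0]
gcc_ratios = [1.32, 1.16, 1.05, 1.03, 1.03, 1 / 1.04, 1 / 1.05, 1.0]

msvc_mean = geometric_mean(msvc_ratios)
gcc_mean = geometric_mean(gcc_ratios)

print(f"MSVC: {msvc_mean:.2f}x faster")
print(f"GCC:  {gcc_mean:.2f}x faster")
```

Because the inputs are already rounded to two decimals, the result should only be expected to agree with the reported means to about that precision.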

markshannon · 2021-11-10T12:34:13Z

Once #29482 is merged this will need to be reworked for the Nth (and hopefully last) time. Sorry about that.
On the bright side, it should save some boilerplate.

sweeneyde · 2021-11-10T14:46:30Z

I think PR 29482 will just make this obsolete -- it manages to re-use *_ADD opcodes for INPLACE_ADD opcodes (nice!):

(In specialize.c)
        case NB_ADD:
        case NB_INPLACE_ADD:
            if (PyUnicode_CheckExact(lhs)) {
                if (_Py_OPCODE(instr[1]) == STORE_FAST && Py_REFCNT(lhs) == 2) {
                    *instr = _Py_MAKECODEUNIT(BINARY_OP_INPLACE_ADD_UNICODE,
                                              _Py_OPARG(*instr));
                    goto success;
                }
                *instr = _Py_MAKECODEUNIT(BINARY_OP_ADD_UNICODE,
                                          _Py_OPARG(*instr));
                goto success;
            }
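
The `_Py_OPCODE(instr[1]) == STORE_FAST` check above relies on `x += y` for a local `x` compiling to an add instruction followed immediately by a store to that local. A small way to look at this from Python is sketched below; the function names `inplace_add_local` and `opname_after_add` are made up for illustration, and the add opcode name depends on the interpreter version (`INPLACE_ADD` on 3.10 and earlier, `BINARY_OP` on 3.11 and later).

```python
import dis


def inplace_add_local(x, y):
    x += y  # in-place add whose result is stored back into a local


def opname_after_add(func):
    """Return the name of the instruction that follows the add in func."""
    instructions = list(dis.get_instructions(func))
    for current, following in zip(instructions, instructions[1:]):
        # The add is INPLACE_ADD on 3.10 and earlier, BINARY_OP on 3.11+.
        if current.opname in ("INPLACE_ADD", "BINARY_OP"):
            return following.opname
    return None  # no add instruction found


print(opname_after_add(inplace_add_local))
```

On the versions mentioned this should print a `STORE_FAST`-family opcode name. The `Py_REFCNT(lhs) == 2` half of the condition is a runtime property and is not visible in the bytecode.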

sweeneyde added 2 commits October 16, 2021 23:38

Initial implementation of INPLACE_MULTIPLY specialization

74ceb37

Merge branch 'main' into inplace_add

b366a06

sweeneyde requested a review from markshannon as a code owner October 18, 2021 09:50

bedevere-bot added the awaiting review label Oct 18, 2021

the-knights-who-say-ni added the CLA signed label Oct 18, 2021

📜🤖 Added by blurb_it.

62c1d38

sweeneyde added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 18, 2021

bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 18, 2021

Remove unused unicode_concatenate

89ea6ba

markshannon self-assigned this Oct 18, 2021

corona10 mentioned this pull request Oct 18, 2021

bpo-45510: Specialize BINARY_SUBTRACT #29010

Closed

sweeneyde added 2 commits October 21, 2021 22:44

merge with main

48bdcf9

remove record_hit_inline calls

e4a1871

sweeneyde closed this Nov 11, 2021

sweeneyde deleted the inplace_add branch December 19, 2021 05:23
