Skip to content

bpo-45510: Specialize BINARY_SUBTRACT #29010

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

corona10
Copy link
Member

@corona10 corona10 commented Oct 17, 2021

@corona10 corona10 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 17, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @corona10 for commit fb360a2b562300ac17a10c49b43e0f496f51a0ee 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 17, 2021
@corona10 corona10 changed the title [WIP + Experiment] Implement Specializing BINARY_SUBTRACT [Experiment] Implement Specializing BINARY_SUBTRACT Oct 18, 2021
@corona10 corona10 changed the title [Experiment] Implement Specializing BINARY_SUBTRACT bpo-45510: Specialize BINARY_SUBTRACT Oct 18, 2021
@corona10 corona10 marked this pull request as ready for review October 18, 2021 11:19
@corona10 corona10 added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 18, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @corona10 for commit b4608cca7dccaf21358e4da854d0701cbe0d0e01 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Oct 18, 2021
@markshannon markshannon self-assigned this Oct 18, 2021
@corona10
Copy link
Member Author

corona10 commented Oct 18, 2021

The reason of build bot failure looks same with #29024
So unrelated with this PR.

@corona10 corona10 force-pushed the BINARY_SUBTRACT_specialize branch 2 times, most recently from ad423f5 to a010e2d Compare October 19, 2021 23:45
@corona10
Copy link
Member Author

@markshannon Rebased!

@markshannon
Copy link
Member

Sorry about the conflicts. I wanted to get both the CALL_FUNCTION specializations PRs merged.

@corona10 corona10 force-pushed the BINARY_SUBTRACT_specialize branch from a010e2d to 6b27e1f Compare October 21, 2021 02:05
@corona10
Copy link
Member Author

corona10 commented Oct 21, 2021

Sorry about the conflicts. I wanted to get both the CALL_FUNCTION specializations PRs merged.

I can do this all day :)

@markshannon
Copy link
Member

A marginal speedup
Note that the top three speedups are probably just random variations.

My concern with this change is that it will slow down any subtraction on anything other than pairs of ints and floats.
I'd like to see some microbenchmarks for int - float, float - int, and decimal - decimal.

Note that BINARY_ADD has the same issue. If we can fix that, then we could use the same approach here.

@pablogsal
Copy link
Member

A marginal speedup
Note that the top three speedups are probably just random variations.

My concern with this change is that it will slow down any subtraction on anything other than pairs of ints and floats.
I'd like to see some microbenchmarks for int - float, float - int, and decimal - decimal.

Note that BINARY_ADD has the same issue. If we can fix that, then we could use the same approach here.

I share the same concerns

@corona10
Copy link
Member Author

@markshannon cc @pablogsal @Fidget-Spinner

My concern with this change is that it will slow down any subtraction on anything other than pairs of ints and floats.

I definitely agree with your concern. So do I, At first, I thought that it's okayish since we already have the issue with BINARY_ADD. So if this change is not worth to applied, let's stale the PR until the issue is solved.

I'd like to see some microbenchmarks for int - float, float - int, and decimal - decimal.

Whether merge this PR or not, it's worth measuring such benchmarks, we can get some data from this experiment.
I will share the result after writing some scripts.

@corona10
Copy link
Member Author

corona10 commented Oct 22, 2021

@markshannon For your data

Benchmark specialize_baseline specialize
int - int 30.1 ns 26.5 ns: 1.13x faster
float - float 24.1 ns 20.2 ns: 1.19x faster
int - float 37.6 ns 36.6 ns: 1.03x faster
decimal - decimal 63.6 ns 71.1 ns: 1.12x slower
Geometric mean (ref) 1.04x faster

Benchmark hidden because not significant (1): float - int

from pyperf import Runner

runner = Runner()

runner.timeit("int - int",
    setup="""
import random
a = random.randint(1, 1000000)
b = random.randint(1, 1000000)
""",
    stmt="""
x = a - b
""")
runner.timeit("float - float",
    setup="""
import random
a = random.uniform(1.5, 3.9)
b = random.uniform(1.5, 3.9)
""",
    stmt="""
x = a - b
"""
    )
runner.timeit("int - float",
    setup="""
import random
a = random.randint(1, 100000)
b = random.uniform(1.5, 3.9)
""",
    stmt="""
x = a - b
"""
    )
runner.timeit("float - int",
    setup="""
import random
a = random.randint(1, 100000)
b = random.uniform(1.5, 3.9)
""",
    stmt="""
x = b - a
"""
    )

runner.timeit("decimal - decimal",
    setup="""
import random
from decimal import Decimal as D
a = D(random.randint(1, 100000))
b = D(random.randint(1, 100000))
""",
    stmt="""
x = a - b
"""
    )

@corona10
Copy link
Member Author

corona10 commented Oct 22, 2021

@markshannon

Experiment 2

  • concept: If either left or right is float handle as both float.
  • BINARY_SUBTRACT_INT + BINARY_SUBTRACT_FLOAT = BINARY_SUBTRACT_FAST
  • This implementation is a kind of workaround so you may not want but if we only handle (int, float) variation, it will work.
  • Need to handle PyFloat_FromDouble is failed and stats handling is needed(This is just PoC)
  • c492437
Benchmark specialize_baseline specialize2
int - int 30.1 ns 27.5 ns: 1.09x faster
float - float 24.1 ns 20.7 ns: 1.16x faster
int - float 37.6 ns 22.5 ns: 1.67x faster
decimal - decimal 63.6 ns 67.9 ns: 1.07x slower
Geometric mean (ref) 1.15x faster

Benchmark hidden because not significant (1): float - int

@corona10
Copy link
Member Author

corona10 commented Oct 22, 2021

Unreliable my VM test with experiment2. :)

Benchmark subtract_baseline subtract_fast
2to3 421 ms 412 ms: 1.02x faster
chaos 112 ms 104 ms: 1.07x faster
deltablue 6.87 ms 6.69 ms: 1.03x faster
fannkuch 575 ms 583 ms: 1.01x slower
hexiom 9.91 ms 9.79 ms: 1.01x faster
json_dumps 18.0 ms 17.5 ms: 1.03x faster
logging_format 12.7 us 11.5 us: 1.11x faster
logging_simple 11.4 us 10.2 us: 1.12x faster
meteor_contest 141 ms 147 ms: 1.04x slower
nqueens 121 ms 117 ms: 1.04x faster
pathlib 34.3 ms 30.3 ms: 1.13x faster
pickle 15.2 us 14.6 us: 1.04x faster
pickle_list 5.46 us 5.36 us: 1.02x faster
python_startup 14.3 ms 14.1 ms: 1.01x faster
python_startup_no_site 9.74 ms 9.50 ms: 1.02x faster
raytrace 482 ms 450 ms: 1.07x faster
regex_compile 212 ms 201 ms: 1.05x faster
regex_dna 261 ms 254 ms: 1.03x faster
regex_effbot 4.17 ms 4.05 ms: 1.03x faster
regex_v8 32.0 ms 32.5 ms: 1.02x slower
scimark_sor 218 ms 212 ms: 1.03x faster
spectral_norm 154 ms 152 ms: 1.01x faster
telco 8.63 ms 8.24 ms: 1.05x faster
unpack_sequence 63.6 ns 60.9 ns: 1.04x faster
unpickle 20.9 us 19.3 us: 1.08x faster
unpickle_pure_python 423 us 375 us: 1.13x faster
xml_etree_parse 221 ms 210 ms: 1.05x faster
xml_etree_iterparse 153 ms 150 ms: 1.02x faster
xml_etree_generate 118 ms 115 ms: 1.03x faster
xml_etree_process 86.0 ms 82.6 ms: 1.04x faster
Geometric mean (ref) 1.03x faster

Benchmark hidden because not significant (16): float, go, json_loads, logging_silent, nbody, pickle_dict, pickle_pure_python, pidigits, pyflate, richards, scimark_fft, scimark_lu, scimark_monte_carlo, scimark_sparse_mat_mult, sqlite_synth, unpickle_list

@markshannon
Copy link
Member

Will need reworking once #29482 is merged. Sorry to make you redo this, yet again 🙁

I think the end result will be better though, as we won't need yet another ADAPTIVE instruction.

@corona10
Copy link
Member Author

corona10 commented Nov 10, 2021

@markshannon
#29482 looks great approach, I will create the new PR once the #29482 is merged.
Thanks for letting me know.

@brandtbucher
Copy link
Member

Should this PR be closed (in favor of #29523)?

@corona10 corona10 closed this Nov 12, 2021
@corona10 corona10 deleted the BINARY_SUBTRACT_specialize branch February 15, 2024 01:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants