Add a feature for using the same number of loops as a previous run #327
Motivation:
On the Faster CPython team, we often collect pystats (counters of various interpreter events) by running the benchmark suite. It is very useful to compare the stats between two commits to see how a pull request affects the interpreter. Unfortunately, with pyperformance's default behavior, where the number of loops is automatically calibrated, a benchmark may not be run the same number of times from one run to the next, making the data hard to compare.
This change adds a new argument to the `run` command which will use the same number of loops as a previous run. The `loops` for each benchmark is looked up from the metadata in the .json output of that previous run, and passed to the underlying call to `pyperf` using the `--loops` argument.
Additionally, this modifies one of the benchmarks, `sqlglot`, to be compatible with that scheme. `sqlglot` is the only `run_benchmark.py` script that runs multiple benchmarks within it in a single call to the script. This makes it impossible to set the number of loops independently for each of these benchmarks. It's been updated to use the pattern from other "suites" of benchmarks (e.g. `async_tree`) where each benchmark has its own `.toml` file and is run independently. This should still be backward compatible with older data collected from this benchmark, but doing `pyperformance run -b sqlglot` will now only run a single benchmark.