Display arithmetic mean of benchmarks in compare site #1125
Conversation
Thanks for kicking this off! I have a couple of thoughts:
- Labeling this simply as a "geometric mean" is potentially not the most user-friendly, as it requires that the viewer have some background knowledge in statistics. I think we can leave it as it is, but we should probably add some sort of tooltip describing the intuition the viewer should have.
- There has been discussion about how not all benchmarks mean the same thing (i.e., some benchmarks are real-world crates, some are stress tests, etc.). It would be interesting to have a breakdown based on those categories. I believe others are looking into this categorization.
- Visually, I think at a minimum we'll want to center the mean underneath the other numbers.
Regarding the categorization: apart from looking at real-world/artificial benchmarks separately, it would also be nice to quickly filter e.g. all … The geometric mean should probably be (re)computed only for the filtered and currently displayed benchmarks, right?
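To sketch what I mean (hypothetical types for illustration, not rustc-perf's actual data model), recomputing over the visible subset is just filter-then-aggregate:

```rust
// Hypothetical types; rustc-perf's real data model differs.
#[derive(PartialEq)]
enum Category {
    RealWorld,
    StressTest,
}

struct BenchmarkResult {
    category: Category,
    pct_change: f64, // relative change in percent, e.g. -1.5 for a 1.5% improvement
}

/// Geometric mean over only the currently displayed (filtered) benchmarks;
/// `None` when the filter leaves nothing to aggregate.
fn geomean_of_filtered(
    results: &[BenchmarkResult],
    keep: impl Fn(&BenchmarkResult) -> bool,
) -> Option<f64> {
    let ratios: Vec<f64> = results
        .iter()
        .filter(|r| keep(r))
        .map(|r| 1.0 + r.pct_change / 100.0)
        .collect();
    if ratios.is_empty() {
        return None;
    }
    let product: f64 = ratios.iter().product();
    Some((product.powf(1.0 / ratios.len() as f64) - 1.0) * 100.0)
}
```

Filtering to one category would then be e.g. `geomean_of_filtered(&results, |r| r.category == Category::RealWorld)`.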
Force-pushed from 059fcda to 13a37d5
@rylev Inspired by your recent change that calculates the average of relevant improvements/regressions in perf summaries on PRs, I changed it like this: the average regression, average improvement, and total average are now shown. Maybe some tooltip could be added to explain this (where?). I used the arithmetic average for the mean, although I still think that the geometric mean is a better fit (both here and in the summary). But these two averages probably won't differ much for the % diffs that we usually get on PRs. I also did some slight refactorings (summary calculation) and fixes (missing …)
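For a rough feel of how the two aggregations compare (a standalone sketch with made-up diffs, not the code from this PR): for small % diffs they are nearly identical, but they diverge once large improvements and regressions mix.

```rust
/// Arithmetic mean of percentage diffs: sum / n.
fn arithmetic_mean(pct_diffs: &[f64]) -> f64 {
    pct_diffs.iter().sum::<f64>() / pct_diffs.len() as f64
}

/// Geometric mean of percentage diffs: turn each diff into a ratio
/// (+2.0% -> 1.02), multiply the ratios, take the n-th root, and
/// convert back to a percentage.
fn geometric_mean(pct_diffs: &[f64]) -> f64 {
    let product: f64 = pct_diffs.iter().map(|d| 1.0 + d / 100.0).product();
    (product.powf(1.0 / pct_diffs.len() as f64) - 1.0) * 100.0
}

fn main() {
    let small = [0.5, -0.3, 1.2, -0.8];     // typical PR-sized diffs
    let large = [30.0, -25.0, 10.0, -15.0]; // mixed large diffs
    for diffs in [&small[..], &large[..]] {
        println!(
            "arithmetic: {:+.3}%  geometric: {:+.3}%",
            arithmetic_mean(diffs),
            geometric_mean(diffs)
        );
    }
}
```

On the `small` input the two means differ by only a few hundredths of a percent; on the `large` input the arithmetic mean is exactly 0% while the geometric mean comes out around -2.3%.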
Force-pushed from 13a37d5 to fcc71ce
+1 for that, it has been bugging me 😄
Awesome!
> Maybe some tooltip could be added to explain this (where?).

I'm not sure, but maybe this can be another "?" tooltip that lives in the upper right corner of the summary box. I don't think we should block merging this, though.
Yeah, a tooltip would be good, as it currently isn't obvious what those numbers mean.
@rylev @nnethercote https://perf.rust-lang.org/compare.html?start=8ebec97e09a89760e5791bbb2ab96e2ebec19931&end=e780264e1e5c1efa6ab76c7b17a9677f16add5e0 — this result looks a bit suspicious with the arithmetic mean. I would like to try to compute the geometric mean for this data to see how it would look, but it's a bit cumbersome to change the code of the live website. Is there a way to download the benchmark results from perf.rlo, as an SQLite file for example?
When some change produces both a lot of improvements and a lot of regressions, it can be quite difficult to gauge the relative size of the improvements/regressions at a glance. The page shows the number of improvements/regressions, but there is no aggregated value of their magnitudes.
What about including a geometric mean of the speedups/slowdowns to provide a quick hint of how the change impacted the benchmarks in general?
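To be concrete about what I mean by that (my notation, assuming each benchmark reports a relative change):

```latex
\text{geomean} = \Bigl(\prod_{i=1}^{n} (1 + d_i)\Bigr)^{1/n} - 1
```

where d_i is the relative change of benchmark i (e.g. d_i = 0.02 for a 2% slowdown), so the result is again a relative change that can be shown as a percentage.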
I put together a quick PoC just to kickstart a discussion (the visual design should be different, and I'm not sure the implementation is fully correct).