Display arithmetic mean of benchmarks in compare site #1125
Conversation
Thanks for kicking this off! I have a couple of thoughts:
- Labeling this simply as a "geometric mean" is potentially not the most user-friendly, as it requires that the viewer have some background knowledge in statistics. I think we can leave it as it is, but we should probably add some sort of tooltip describing the intuition the viewer should have.
- There has been discussion about how not all benchmarks mean the same thing (i.e., some benchmarks are real-world crates, some are stress tests, etc.). It would be interesting to have a breakdown based on those categories. I believe others are looking into this categorization.
- Visually, I think at a minimum we'll want to center the mean underneath the other numbers.
Regarding the categorization: apart from looking at real-world/artificial benchmarks separately, it would also be nice to quickly filter e.g. all … The geometric mean should probably be (re)computed only for the filtered and currently displayed benchmarks, right?
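To sketch what I mean (hypothetical types for illustration, not rustc-perf's actual data model), recomputing over the visible subset is just filter-then-aggregate:

```rust
// Hypothetical types; rustc-perf's real data model differs.
#[derive(PartialEq)]
enum Category {
    RealWorld,
    StressTest,
}

struct BenchmarkResult {
    category: Category,
    pct_change: f64, // relative change in percent, e.g. -1.5 for a 1.5% improvement
}

/// Geometric mean over only the currently displayed (filtered) benchmarks;
/// `None` when the filter leaves nothing to aggregate.
fn geomean_of_filtered(
    results: &[BenchmarkResult],
    keep: impl Fn(&BenchmarkResult) -> bool,
) -> Option<f64> {
    let ratios: Vec<f64> = results
        .iter()
        .filter(|r| keep(r))
        .map(|r| 1.0 + r.pct_change / 100.0)
        .collect();
    if ratios.is_empty() {
        return None;
    }
    let product: f64 = ratios.iter().product();
    Some((product.powf(1.0 / ratios.len() as f64) - 1.0) * 100.0)
}
```

Filtering to one category would then be e.g. `geomean_of_filtered(&results, |r| r.category == Category::RealWorld)`.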
Force-pushed from 059fcda to 13a37d5
@rylev Inspired by your recent change that calculates the average of relevant improvements/regressions in perf summaries on PRs, I changed it like this: the average regression, average improvement, and total average are now shown. Maybe some tooltip could be added to explain this (where?). I used the arithmetic average for the mean, although I still think that the geometric mean is a better fit (both here and in the summary). But these two averages probably won't differ much for the % diffs that we usually get on PRs. I also did some slight refactorings (summary calculation) and fixes (missing …)
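For a rough feel of how the two aggregations compare (a standalone sketch with made-up diffs, not the code from this PR): for small % diffs they are nearly identical, but they diverge once large improvements and regressions mix.

```rust
/// Arithmetic mean of percentage diffs: sum / n.
fn arithmetic_mean(pct_diffs: &[f64]) -> f64 {
    pct_diffs.iter().sum::<f64>() / pct_diffs.len() as f64
}

/// Geometric mean of percentage diffs: turn each diff into a ratio
/// (+2.0% -> 1.02), multiply the ratios, take the n-th root, and
/// convert back to a percentage.
fn geometric_mean(pct_diffs: &[f64]) -> f64 {
    let product: f64 = pct_diffs.iter().map(|d| 1.0 + d / 100.0).product();
    (product.powf(1.0 / pct_diffs.len() as f64) - 1.0) * 100.0
}

fn main() {
    let small = [0.5, -0.3, 1.2, -0.8];     // typical PR-sized diffs
    let large = [30.0, -25.0, 10.0, -15.0]; // mixed large diffs
    for diffs in [&small[..], &large[..]] {
        println!(
            "arithmetic: {:+.3}%  geometric: {:+.3}%",
            arithmetic_mean(diffs),
            geometric_mean(diffs)
        );
    }
}
```

On the `small` input the two means differ by only a few hundredths of a percent; on the `large` input the arithmetic mean is exactly 0% while the geometric mean comes out around -2.3%.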
Force-pushed from 13a37d5 to fcc71ce
+1 for that, it has been bugging me 😄
Awesome!
> Maybe some tooltip could be added to explain this (where?).

I'm not sure, but maybe this can be another "?" tooltip that lives in the upper right corner of the summary box. I don't think we should block merging this, though.
Yeah, a tooltip would be good, as it currently isn't obvious what those numbers mean.
@rylev @nnethercote https://perf.rust-lang.org/compare.html?start=8ebec97e09a89760e5791bbb2ab96e2ebec19931&end=e780264e1e5c1efa6ab76c7b17a9677f16add5e0 — this result looks a bit suspicious with the arithmetic mean. I would like to try to compute the geometric mean for this data to see how it would look, but it's a bit cumbersome to change the code of the live website. Is there a way to download the benchmark results from perf.rlo, as an SQLite file for example?
When some change produces both a lot of improvements and a lot of regressions, it can be quite difficult to gauge the relative size of the improvements/regressions at a glance. The page shows the number of improvements/regressions, but there is no aggregated value of their magnitudes.
What about including a geometric mean of the speedups/slowdowns to provide a quick hint of how the change impacted the benchmarks in general?
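To be concrete about what I mean by that (my notation, assuming each benchmark reports a relative change):

```latex
\text{geomean} = \Bigl(\prod_{i=1}^{n} (1 + d_i)\Bigr)^{1/n} - 1
```

where d_i is the relative change of benchmark i (e.g. d_i = 0.02 for a 2% slowdown), so the result is again a relative change that can be shown as a percentage.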
I put together a quick PoC just to kickstart a discussion (the visual design should be different, and I'm not sure the implementation is fully correct).