
Use a summary table for PR comments #1245


Merged: 1 commit from the gh-report-table branch into master on Apr 1, 2022

Conversation

@Kobzol (Contributor) commented Mar 25, 2022

This PR implements a new design for reporting performance summaries on PRs.

It looks something like this:

| | Regressions 😿 (primary) | Regressions 😿 (secondary) | Improvements 🎉 (primary) | Improvements 🎉 (secondary) | All 😿 🎉 (primary) |
|:---|---:|---:|---:|---:|---:|
| count¹ | 2 | 1 | 2 | 1 | 4 |
| mean² | 150.0% | 100.0% | -62.5% | -66.7% | 43.8% |
| max | 200.0% | 100.0% | -75.0% | -66.7% | 200.0% |

Should the table also be used in the weekly triage or not? (This will be left for another PR.)

Closes: #1214

Footnotes:

1. number of relevant changes
2. the arithmetic mean of the percent change
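
For reference, the three statistics per column are simple aggregates over the relevant percent changes. A minimal, illustrative Rust sketch (the helper name and the magnitude-based reading of "max" are my own interpretation of the table, not the actual rustc-perf code):

```rust
/// Illustrative only: compute the per-column statistics shown in the table
/// (count of relevant changes, arithmetic mean, and the change with the
/// largest magnitude) from a slice of percent changes.
fn column_stats(percent_changes: &[f64]) -> (usize, f64, f64) {
    let count = percent_changes.len();
    let mean = if count > 0 {
        percent_changes.iter().sum::<f64>() / count as f64
    } else {
        0.0
    };
    let max = percent_changes
        .iter()
        .copied()
        .fold(0.0_f64, |acc, c| if c.abs() > acc.abs() { c } else { acc });
    (count, mean, max)
}

fn main() {
    // Values chosen to reproduce the "Improvements 🎉 (primary)" column above.
    let (count, mean, max) = column_stats(&[-50.0, -75.0]);
    println!("count: {count}, mean: {mean:.1}%, max: {max:.1}%"); // 2, -62.5%, -75.0%
}
```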

@Kobzol requested review from rylev and Mark-Simulacrum on March 25, 2022 at 18:27
@rylev (Member) left a comment


Looks like you're missing a run of `cargo fmt` to get CI passing.

> Should the table also be used in the weekly triage or not?

Yes, I think that would be helpful.

> I don't really like that write_summary_table is a method on ComparisonSummary, this should be a free function I think. Same with write_summary_lines.

Feel free to make this refactor 😊

> Currently, primary and secondary benchmarks are not distinguished, because this information is not currently propagated to the performance summaries. Should we add the benchmark configuration to TestResultComparison or pass some Map<BenchmarkName, BenchmarkData> to the summary?

I'd like for this change to get in before we merge if possible. It seems somewhat redundant to put the category into `TestResultComparison`, since that data will be the same for all test cases with the same benchmark. So having some benchmark-to-`Category` map would probably work best (a rough sketch follows below).

> Should columns without any entries be present or not? I.e. should we show 0 entries or just hide the column altogether?

I think we should keep all columns even when they have 0 entries.
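
A minimal sketch of the free-function plus category-map idea; apart from the names `Category` and `write_summary_table`, the types and layout here are assumptions for illustration, not the actual rustc-perf code:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for types that are only referenced by name in the
// discussion above.
type BenchmarkName = String;
struct ComparisonSummary;

#[derive(Clone, Copy, PartialEq, Eq)]
enum Category {
    Primary,
    Secondary,
}

/// A free function rather than a method on `ComparisonSummary`: it takes a
/// benchmark-to-category map so that primary and secondary columns can be
/// split without storing the category inside every `TestResultComparison`.
fn write_summary_table(
    summary: &ComparisonSummary,
    categories: &HashMap<BenchmarkName, Category>,
    result: &mut String,
) {
    // Per comparison, look up its benchmark in `categories`, bucket the
    // percent change into the matching primary/secondary column, then render
    // the Markdown rows into `result`.
    let _ = (summary, categories, result);
}
```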

@Kobzol (Contributor, Author) commented Mar 29, 2022

To avoid complicating the table render function and to keep category handling out of summaries, I changed it so that the render function directly receives two summaries (primary and secondary), and it is up to the caller to generate these. Since a summary can now be empty, I commented out the check that returned `None` from `summarize_comparison` when the resulting summary would be empty.

One issue is that some columns are split between primary/secondary, but the last one (all relevant changes) contains data for both categories together. I added a TODO comment to the code with possible ways to resolve this; I'll be glad to hear your thoughts.

Since `Category` is not exported by the collector (this was done intentionally, as I remember), categories are handled in a string-like manner, which is not super nice, but we can improve that with some `PRIMARY`/`SECONDARY` constants.
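
Roughly what the reworked interface described above might look like; the signature and the constants are my own illustration of the approach, not the actual patch:

```rust
// Hypothetical constants standing in for the string-typed categories until an
// exported enum is available.
const PRIMARY: &str = "primary";
const SECONDARY: &str = "secondary";

// Placeholder for the real `ComparisonSummary` type.
struct ComparisonSummary;

/// The render function receives both summaries directly; splitting the
/// comparisons into primary and secondary is up to the caller, so the summary
/// code itself stays category-agnostic.
fn write_summary_table(
    primary: &ComparisonSummary,
    secondary: &ComparisonSummary,
    result: &mut String,
) {
    // Render one Markdown row per statistic, with separate primary and
    // secondary columns for regressions and improvements.
    let _ = (primary, secondary, result);
}
```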

@rylev (Member) left a comment


> the Category is not exported by collector (this was done intentionally as I remember)

Why doesn't the collector export `Category`?

> One issue is that some columns are split between primary/secondary

It's a shame that GitHub doesn't support more complex tables. If it did, we could have each category be a two-column-wide group that splits into regressions and improvements. Oh well.

I think it's much more interesting to have the average change for the primary category; mixing primary and secondary will make for worse data.

@Kobzol (Contributor, Author) commented Mar 30, 2022

> Why doesn't the collector export Category?

I checked the git history. It was originally in the database crate:
https://github.com/rust-lang/rustc-perf/pull/1181/files#diff-9c399cba4cb055f29cca4a48993625e40ffe47dd9a434ef8de6024d72dcdd1b9R775

Then it was moved in 4009f1e (I don't think this had a PR). @nnethercote, would it be okay if `Category` was exported from the collector crate?

@nnethercote (Contributor) commented:

Exporting `Category` seems fine to me.

@Kobzol (Contributor, Author) commented Mar 30, 2022

Exported `Category` and hooked the summary table function to the GitHub listener.

@Kobzol (Contributor, Author) commented Mar 30, 2022

I would leave changing the summary used in triages for another PR; there are some links generated there and I'm not sure how to make that work with the table. For PR comments, though, this should be ready for review.

@rylev (Member) left a comment


I think we're almost there, but there are a few things I'd like addressed.

@Kobzol (Contributor, Author) commented Mar 31, 2022

Now that I look at the table, maybe it would be better to group the columns by category rather than by direction? I.e., primary improvements, primary regressions, secondary improvements, secondary regressions.

@rylev (Member) commented Mar 31, 2022

I imagine that folks are generally more interested in regressions (improvements are great, but for PRs that are not explicitly performance related, most people just want to know "did this change make the compiler slower?"). With that in mind, I think it makes the most sense to show regressions first and then improvements, i.e., group by direction.

@Kobzol (Contributor, Author) commented Mar 31, 2022

Oh, I'm quite biased: most of my PRs so far were explicitly focused on improving performance, so I was mostly looking for the improvements :D But you're right, in the general case regressions should come first.

Rebased on master to resolve a conflict and updated things based on the review.

@rylev (Member) left a comment


Looks good. I had one nit, but we can merge this first before addressing it if you want to.

@rylev (Member) left a comment


🎉

@rylev merged commit 5f847d5 into master on Apr 1, 2022
@rylev deleted the gh-report-table branch on Apr 1, 2022 at 07:48
@lqd (Member) commented Apr 1, 2022

@Kobzol, the report looks strange in rust-lang/rust#95171 (comment), but you must have noticed, since that's your PR :)

Maybe there's a mismatch between the header and the actual column count?
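
If the cause is indeed a column-count mismatch, a cheap guard while building the table might look like this; the helper is purely illustrative, not rustc-perf code:

```rust
/// Illustrative sanity check: every data row must have exactly as many cells
/// as the header, otherwise GitHub's Markdown renderer can render the table
/// oddly (extra cells are dropped, missing ones show up empty).
fn check_column_counts(header: &[&str], rows: &[Vec<String>]) {
    for (i, row) in rows.iter().enumerate() {
        debug_assert_eq!(
            row.len(),
            header.len(),
            "summary table row {i} has {} cells but the header has {} columns",
            row.len(),
            header.len()
        );
    }
}
```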

@Kobzol (Contributor, Author) commented Apr 1, 2022

Thanks, I'll check it ASAP.
