Skip to content

Improve triage readme to better reflect current practices #875

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 9, 2021
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
86 changes: 61 additions & 25 deletions triage/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,13 @@ usage.
- rylev
- pnkfelix

Monday evening in North America is a good time to do it because This Week in Rust (see below) is usually
published on Tuesday, US time, and so it means the PR to include the triage
details in TWiR can be merged shortly before publication. This time is also
shortly before the weekly Rust compiler meeting, where the results are looked
at.
Monday evening to Tuesday afternoon in North America is a good time to do it
because This Week in Rust (see below) is usually published on Wednesday, US time,
and so it means the PR to include the triage details in TWiR can be merged shortly
before publication. This time is also shortly before the weekly Rust compiler
meeting, where the results are looked at.

## Instructions
## Generating the Report

First, check the previous triage log entry. Look for responses in PRs, and
follow up on any promised actions. (i.e. nag people!)
Expand All @@ -34,7 +34,9 @@ Use the provided script to automate building the file:
% ./weekly-report.py PARENT > YYYY-MM-DD.md
```

You can also do it manually, starting by viewing the [perf website](https://perf.rust-lang.org).
### Manual Generation

You can also build the report manually, starting by viewing the [perf website](https://perf.rust-lang.org).

- Determine the revision range. The start revision corresponds to the end
revision from the previous log entry. The end revision is the latest measured
Expand All @@ -45,31 +47,67 @@ You can also do it manually, starting by viewing the [perf website](https://perf
see.
- Record the revision range, with a link, in the log entry.

Look for significant changes (regressions or improvements) in the instruction
count graphs.
## Analysis

The following is a list of items you should look for when interpreting performance results.

If you've generated a triage report, you will go through each entry in the report
and verify that it is properly labeled as a regression, improvement, or a mix
of the two. For instance, some entries that are labeled as regressions, are actually
not regressions and have only been labeled as such due to noise.

### Viewing Results

Look for significant changes (regressions or improvements) in the following:
- instruction count
- max rss

When working with graphs:
- Click and drag a region of a graph to zoom in on it. This is useful when data
points are close together.
- Click on a data point to open the "compare" page for that merge.
- Click on a data point to open the "compare" page for that merge. This opens the comparison pages that are linked to in the generated triage report.

To view the code changes:
- Click on the "compare" link at the top of the measurements on that page to
open the page of commits in the merge.
- Alternatively, it may be easier to simply click through to the "compare" page
for every revision in the range (there often aren't that many), rather than
looking at the graphs.

Understanding the comparison page:
- Each benchmark is listed with the min, max and the avg percentage change
(red indicates regressions, green indicates improvements) across the various
benchmarks run (e.g., full, incremental-full, incremental-unchanged, etc.).
- Clicking on a specific benchmark will show the results for each of the various
benchmarks. Clicking on the percentages will open a more specific detail view
of timing for queries run during compilation.

### Interpreting Results

*Warning*: max rss is much more variable than instruction count. Recheck the "Absolute
data" checkbox otherwise the noise becomes unmanageable.

- A change isn't significant unless one or more of the benchmarks changed by at
least 1%.

Easy cases: there is only a single PR in the merge.
#### Dealing with noise

Some benchmarks are noisy. Those that are known to be noisy typically have their results
labeled as such with a `?`. These results can usually be dismissed if they are below 2.5% and are not accompanied by other regressions in other benchmarks.

## Action

### Ping PR Author/Reviewer

Single PR in Merge:
- Add a comment to the PR pointing to the "compare" page (unless someone else
has already done that).
- In the case of a regression, ask the author for a response. If it's a big
regression, consider requesting a backout. It may be worth looking through
the comments to see if any perf CI runs were done, and whether the
regression was expected.
- Add an entry to the triage log. Include the PR title and number, a link to
the PR comment you added mentioning the performance effect, and a link to the
performance results. Include useful details, such as the size of the
regression/improvement, and any promises of follow-up action from authors in
the case of a regression.
regression, consider requesting the author revert their changes. It may
be worth looking through the comments to see if any perf CI runs were done,
and whether the regression was expected.
- If you did not generate the triage report, add an entry.
Include the PR title and number, a link to the PR comment you added mentioning the
performance effect, and a link to the performance results. Include useful details,
such as the size of the regression/improvement, and any promises of follow-up action
from authors in the case of a regression.

Difficult cases: the merge was a rollup of multiple PRs.
- Look through the PRs and try to determine which was the cause. Often you
Expand All @@ -84,9 +122,7 @@ Difficult cases: the merge was a rollup of multiple PRs.
that are likely to affect performance.
- Add an entry to the triage log, as for the easy cases.

Repeat with the `max-rss` graphs. These measurements are much noisier than
`instructions:u`, so only larger changes will be clear; recheck the "Absolute
data" checkbox otherwise the noise becomes unmanageable.
### This Week in Rust

Once finished, file a PR adding a link to the log entry in [This Week in
Rust](https://github.com/emberian/this-week-in-rust/).
Expand Down