Merge pull request #875 from rylev/improve-triage-readme

Mark-Simulacrum · web-flow · commit 5cb458a36a90 · 2021-06-09T10:58:29.000-04:00
Improve triage readme to better reflect current practices
diff --git a/triage/README.md b/triage/README.md
@@ -10,13 +10,13 @@ usage.
 - rylev
 - pnkfelix
 
-Monday evening in North America is a good time to do it because This Week in Rust (see below) is usually
-published on Tuesday, US time, and so it means the PR to include the triage
-details in TWiR can be merged shortly before publication. This time is also
-shortly before the weekly Rust compiler meeting, where the results are looked
-at.
+Monday evening to Tuesday afternoon in North America is a good time to do it 
+because This Week in Rust (see below) is usually published on Wednesday, US time, 
+and so it means the PR to include the triage details in TWiR can be merged shortly 
+before publication. This time is also shortly before the weekly Rust compiler 
+meeting, where the results are looked at.
 
-## Instructions
+## Generating the Report 
 
 First, check the previous triage log entry. Look for responses in PRs, and
 follow up on any promised actions. (i.e. nag people!)
@@ -34,7 +34,9 @@ Use the provided script to automate building the file:
 % ./weekly-report.py PARENT > YYYY-MM-DD.md
 ```
 
-You can also do it manually, starting by viewing the [perf website](https://perf.rust-lang.org).
+### Manual Generation 
+
+You can also build the report manually, starting by viewing the [perf website](https://perf.rust-lang.org).
 
 - Determine the revision range. The start revision corresponds to the end
   revision from the previous log entry. The end revision is the latest measured
@@ -45,31 +47,67 @@ You can also do it manually, starting by viewing the [perf website](https://perf
   see.
 - Record the revision range, with a link, in the log entry.
 
-Look for significant changes (regressions or improvements) in the instruction
-count graphs.
+## Analysis
+
+The following is a list of items you should look for when interpreting performance results. 
+
+If you've generated a triage report, you will go through each entry in the report
+and verify that it is properly labeled as a regression, improvement, or a mix
+of the two. For instance, some entries that are labeled as regressions, are actually 
+not regressions and have only been labeled as such due to noise.
+
+### Viewing Results
+
+Look for significant changes (regressions or improvements) in the following:
+- instruction count
+- max rss
+
+When working with graphs: 
 - Click and drag a region of a graph to zoom in on it. This is useful when data
   points are close together.
-- Click on a data point to open the "compare" page for that merge.
+- Click on a data point to open the "compare" page for that merge. This opens the comparison pages that are linked to in the generated triage report.
+
+To view the code changes:
 - Click on the "compare" link at the top of the measurements on that page to
   open the page of commits in the merge.
-- Alternatively, it may be easier to simply click through to the "compare" page
-  for every revision in the range (there often aren't that many), rather than
-  looking at the graphs.
+
+Understanding the comparison page:
+- Each benchmark is listed with the min, max and the avg percentage change 
+  (red indicates regressions, green indicates improvements) across the various 
+  benchmarks run (e.g., full, incremental-full, incremental-unchanged, etc.).
+- Clicking on a specific benchmark will show the results for each of the various
+  benchmarks. Clicking on the percentages will open a more specific detail view 
+  of timing for queries run during compilation.
+
+### Interpreting Results
+
+*Warning*: max rss is much more variable than instruction count. Recheck the "Absolute
+data" checkbox otherwise the noise becomes unmanageable.
+
 - A change isn't significant unless one or more of the benchmarks changed by at
   least 1%.
 
-Easy cases: there is only a single PR in the merge.
+#### Dealing with noise
+
+Some benchmarks are noisy. Those that are known to be noisy typically have their results 
+labeled as such with a `?`. These results can usually be dismissed if they are below 2.5% and are not accompanied by other regressions in other benchmarks. 
+
+## Action
+
+### Ping PR Author/Reviewer 
+
+Single PR in Merge:
 - Add a comment to the PR pointing to the "compare" page (unless someone else
   has already done that).
 - In the case of a regression, ask the author for a response. If it's a big
-  regression, consider requesting a backout. It may be worth looking through
-  the comments to see if any perf CI runs were done, and whether the
-  regression was expected.
-- Add an entry to the triage log. Include the PR title and number, a link to
-  the PR comment you added mentioning the performance effect, and a link to the
-  performance results. Include useful details, such as the size of the
-  regression/improvement, and any promises of follow-up action from authors in
-  the case of a regression.
+  regression, consider requesting the author revert their changes. It may 
+  be worth looking through the comments to see if any perf CI runs were done, 
+  and whether the regression was expected.
+- If you did not generate the triage report, add an entry. 
+  Include the PR title and number, a link to the PR comment you added mentioning the 
+  performance effect, and a link to the performance results. Include useful details, 
+  such as the size of the regression/improvement, and any promises of follow-up action 
+  from authors in the case of a regression.
 
 Difficult cases: the merge was a rollup of multiple PRs.
 - Look through the PRs and try to determine which was the cause. Often you
@@ -84,9 +122,7 @@ Difficult cases: the merge was a rollup of multiple PRs.
   that are likely to affect performance.
 - Add an entry to the triage log, as for the easy cases.
 
-Repeat with the `max-rss` graphs. These measurements are much noisier than
-`instructions:u`, so only larger changes will be clear; recheck the "Absolute
-data" checkbox otherwise the noise becomes unmanageable.
+### This Week in Rust 
 
 Once finished, file a PR adding a link to the log entry in [This Week in
 Rust](https://github.com/emberian/this-week-in-rust/).