Skip to content

Commit 08c0319

Browse files
authored
Merge pull request #1340 from pnkfelix/triage-2022-05-31
perf triage for this week.
2 parents 863d2be + 60488ad commit 08c0319

File tree

1 file changed

+251
-0
lines changed

1 file changed

+251
-0
lines changed

triage/2022-05-31.md

Lines changed: 251 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,251 @@
1+
# 2022-05-31 Triage Log
2+
3+
A good week: The regressions were small; some have follow-up PR's in flight to
4+
address them; and we saw a big improvement from PR
5+
[#97345](https://github.com/rust-lang/rust/pull/97345), which adds more fast
6+
paths for quickly exiting comparisons between two types (such as `BitsImpl<M>`
7+
and `BitsImpl<N>` for const integers `M` and `N`). This improved compile-times
8+
for the `bitmaps` benchmark by 50-65% in some cases (including the trunk
9+
`nalgebra`, according to independent investigation from nnethercote). That same
10+
PR had more modest improvements (1% to 2%) to the compile-times for a number of
11+
other crates. Many thanks to lcnr and nnethercote for some excellent work here!
12+
13+
Triage done by **@pnkfelix**.
14+
Revision range: [43d9f385..0a43923a](https://perf.rust-lang.org/?start=43d9f3859e0204e764161ee085a360274b5f3e9a&end=0a43923a86c3b8f11d005884871b152f59b746f7&absolute=false&stat=instructions%3Au)
15+
16+
**Summary**:
17+
18+
| | mean | max | count |
19+
|:----------:|:----:|:---:|:-----:|
20+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
21+
| Regressions 😿 <br /> (secondary) | 0.8% | 0.9% | 3 |
22+
| Improvements 🎉 <br /> (primary) | -4.0% | -65.9% | 227 |
23+
| Improvements 🎉 <br /> (secondary) | -2.0% | -7.7% | 217 |
24+
| All 😿🎉 (primary) | -4.0% | -65.9% | 227 |
25+
26+
27+
3 Regressions, 1 Improvements, 9 Mixed; 0 of them in rollups
28+
59 artifact comparisons made in total
29+
30+
#### Regressions
31+
32+
Proc macro tweaks [#97004](https://github.com/rust-lang/rust/pull/97004) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=9a42c6509d399fd205917ebce474b31315c5d3e9&end=f558990814bb43cfb67db321b299dfdf275663e3&stat=instructions:u)
33+
34+
| | mean | max | count |
35+
|:----------:|:----:|:---:|:-----:|
36+
| Regressions 😿 <br /> (primary) | 0.2% | 0.2% | 1 |
37+
| Regressions 😿 <br /> (secondary) | 0.6% | 1.5% | 10 |
38+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
39+
| Improvements 🎉 <br /> (secondary) | N/A | N/A | 0 |
40+
| All 😿🎉 (primary) | 0.2% | 0.2% | 1 |
41+
42+
* aparently making a `Buffer<T>` non-generic (its solely instantiated at `u8`) caused a performance regression.
43+
* there is ongoing follow-up work to address the regression, either by making the buffer generic again, or by adding `#[inline]` annotations.
44+
* see e.g. [PR 97539](https://github.com/rust-lang/rust/pull/97539)
45+
46+
47+
rustdoc: include impl generics / self in search index [#96652](https://github.com/rust-lang/rust/pull/96652) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=84288ed6d5307ed44a0f78e2f1ee55fbadf4e978&end=0acc4a35853215a6f9388ab61455ced309711003&stat=instructions:u)
48+
49+
| | mean | max | count |
50+
|:----------:|:----:|:---:|:-----:|
51+
| Regressions 😿 <br /> (primary) | 0.5% | 1.0% | 3 |
52+
| Regressions 😿 <br /> (secondary) | 1.1% | 1.1% | 3 |
53+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
54+
| Improvements 🎉 <br /> (secondary) | N/A | N/A | 0 |
55+
| All 😿🎉 (primary) | 0.5% | 1.0% | 3 |
56+
57+
* This regressed doc-generation for a few primary benchmarks, but I think that might be a inherent cost of a change like this. I marked it as triaged based on that assumption.
58+
59+
Move things to `rustc_type_ir` [#97287](https://github.com/rust-lang/rust/pull/97287) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=303d916867040e269b54adf3cfc7f5c903dc26ff&end=0f06824013761ed6829887019033f1001e68f623&stat=instructions:u)
60+
61+
| | mean | max | count |
62+
|:----------:|:----:|:---:|:-----:|
63+
| Regressions 😿 <br /> (primary) | 0.5% | 0.7% | 9 |
64+
| Regressions 😿 <br /> (secondary) | 1.1% | 1.1% | 3 |
65+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
66+
| Improvements 🎉 <br /> (secondary) | N/A | N/A | 0 |
67+
| All 😿🎉 (primary) | 0.5% | 0.7% | 9 |
68+
69+
* During development, this PR was identified as regressing one primary benchmark, `bitmaps` (in a variety of contexts), and the PR author [said](https://github.com/rust-lang/rust/pull/97287#issuecomment-1139917174) better to take that perf hit and deal with it later.
70+
* In the PR that landed, the number of regressing variants was smaller, but the affected set of benchmarks slightly larger: [both `bitmaps` and `unicode-normalization`](https://perf.rust-lang.org/compare.html?start=303d916867040e269b54adf3cfc7f5c903dc26ff&end=0f06824013761ed6829887019033f1001e68f623&stat=instructions:u) are affected, regressing by 0.35% to 0.70%.
71+
* My very brief inpsection of the flamegraphs ([old](https://perf.rust-lang.org/perf/processed-self-profile?commit=303d916867040e269b54adf3cfc7f5c903dc26ff&benchmark=bitmaps-3.1.0-check&scenario=full&type=flamegraph), [new](https://perf.rust-lang.org/perf/processed-self-profile?commit=0f06824013761ed6829887019033f1001e68f623&benchmark=bitmaps-3.1.0-check&scenario=full&type=flamegraph)) didn't show any smoking guns.
72+
* At this point I think we can just live with this performance hit.
73+
74+
#### Improvements
75+
76+
Add suggestion for relaxing static lifetime bounds on dyn trait impls in NLL [#97284](https://github.com/rust-lang/rust/pull/97284) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=764b8615e9149431d8790e3c07cb663642fe393d&end=ed76b773b57cf0aa48ec4e2fc6d6a3f7a9079491&stat=instructions:u)
77+
78+
| | mean | max | count |
79+
|:----------:|:----:|:---:|:-----:|
80+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
81+
| Regressions 😿 <br /> (secondary) | N/A | N/A | 0 |
82+
| Improvements 🎉 <br /> (primary) | -0.2% | -0.2% | 1 |
83+
| Improvements 🎉 <br /> (secondary) | -5.3% | -5.9% | 6 |
84+
| All 😿🎉 (primary) | -0.2% | -0.2% | 1 |
85+
86+
* This managed to improve performance (slightly), probably because it [restructured](https://github.com/rust-lang/rust/pull/97284/commits/3c6c8d5a8dbf4db20450ed5793ef35f29c13466c#r882897413) the `CallArgument` and that had fallout that ended up being positive overall (despite our intuition being that it would hurt performance due to the increase in size).
87+
* It might be nice to confirm that hypothesis independently (by isolating just that structural change and confirming that it has a similar effect on performance here) ...
88+
* ... but, the improvements are essentially isolated to just the secondary wg-grammar benchmark, so its not really worth too much digging, except if we think it might reveal other structural changes we should make elsewhere.
89+
90+
#### Mixed
91+
92+
add a deep fast_reject routine [#97345](https://github.com/rust-lang/rust/pull/97345) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=cbdce423201f1b155c46f3ec690a644cf3b4ba53&end=4a99c5f504ab65a0fd9d60f515811e1d9cff8c0a&stat=instructions:u)
93+
94+
| | mean | max | count |
95+
|:----------:|:----:|:---:|:-----:|
96+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
97+
| Regressions 😿 <br /> (secondary) | 0.9% | 1.6% | 10 |
98+
| Improvements 🎉 <br /> (primary) | -11.0% | -65.8% | 45 |
99+
| Improvements 🎉 <br /> (secondary) | -0.5% | -0.9% | 12 |
100+
| All 😿🎉 (primary) | -11.0% | -65.8% | 45 |
101+
102+
* this was great, as noted in summary.
103+
104+
105+
Move various checks to typeck so them failing causes the typeck result to get tainted [#96046](https://github.com/rust-lang/rust/pull/96046) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=46147119ec545045948bc799581d93edd3b1617b&end=56fd680cf9226ab424f88d4e3b43c5e088d17f19&stat=instructions:u)
106+
107+
| | mean | max | count |
108+
|:----------:|:----:|:---:|:-----:|
109+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
110+
| Regressions 😿 <br /> (secondary) | 0.9% | 1.1% | 4 |
111+
| Improvements 🎉 <br /> (primary) | -0.3% | -0.4% | 4 |
112+
| Improvements 🎉 <br /> (secondary) | -0.3% | -0.3% | 24 |
113+
| All 😿🎉 (primary) | -0.3% | -0.4% | 4 |
114+
115+
* This had some small improvements to primary benchmarks.
116+
* The CTFE stress test regressed, but I assume that was expected since this was a change to the CTFE engine (to address some ICE's).
117+
118+
Update jemalloc to v5.3 [#96790](https://github.com/rust-lang/rust/pull/96790) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=56fd680cf9226ab424f88d4e3b43c5e088d17f19&end=ebbcbfc236ced21d5e6a92269edb704692ff26b8&stat=instructions:u)
119+
120+
| | mean | max | count |
121+
|:----------:|:----:|:---:|:-----:|
122+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
123+
| Regressions 😿 <br /> (secondary) | 0.3% | 0.3% | 1 |
124+
| Improvements 🎉 <br /> (primary) | -0.9% | -6.2% | 212 |
125+
| Improvements 🎉 <br /> (secondary) | -1.1% | -3.2% | 191 |
126+
| All 😿🎉 (primary) | -0.9% | -6.2% | 212 |
127+
128+
* This had various improvements to our primary benchmarks' instruction counts.
129+
* The other big thing to observe: the max-rss was improved in several primary
130+
benchmarks (by 1% to 6%), and some secondary benchmarks saw even more
131+
significant improvements to their max-rss.
132+
133+
Split dead store elimination off dest prop [#97158](https://github.com/rust-lang/rust/pull/97158) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=19abca1172ae10d5c084b6c3013d92680f92dd8b&end=68314177e70017c08f6cdf295631bb508f9f85bc&stat=instructions:u)
134+
135+
| | mean | max | count |
136+
|:----------:|:----:|:---:|:-----:|
137+
| Regressions 😿 <br /> (primary) | 0.5% | 1.3% | 15 |
138+
| Regressions 😿 <br /> (secondary) | 0.6% | 2.3% | 12 |
139+
| Improvements 🎉 <br /> (primary) | -0.4% | -1.9% | 50 |
140+
| Improvements 🎉 <br /> (secondary) | -0.6% | -1.3% | 33 |
141+
| All 😿🎉 (primary) | -0.2% | -1.9% | 65 |
142+
143+
* The changes here were investigated by [the PR author](https://github.com/rust-lang/rust/pull/97158#issuecomment-1140550402).
144+
* Marking as triaged based on their investigation.
145+
146+
Try to cache region_scope_tree as a query [#97383](https://github.com/rust-lang/rust/pull/97383) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=68314177e70017c08f6cdf295631bb508f9f85bc&end=4f39fb1f34d4bd25d9ce96afe7b2d109f073e286&stat=instructions:u)
147+
148+
| | mean | max | count |
149+
|:----------:|:----:|:---:|:-----:|
150+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
151+
| Regressions 😿 <br /> (secondary) | 1.0% | 1.0% | 3 |
152+
| Improvements 🎉 <br /> (primary) | -1.7% | -4.7% | 98 |
153+
| Improvements 🎉 <br /> (secondary) | -2.4% | -7.7% | 43 |
154+
| All 😿🎉 (primary) | -1.7% | -4.7% | 98 |
155+
156+
* This was a targeted PR to address the regressions introduced by PR [#95563](https://github.com/rust-lang/rust/pull/95563) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=acfd327fd4e3a302ebb0a077f422a527a7935333&end=653463731a7f01f519cf85f444869def27f00395&stat=instructions:u).
157+
* It seems to succeed at recovering the instruction-count performance lost in that PR.
158+
* It doesn't *quite* recover all of the max-rss lost
159+
(PR [#97383 improves max-rss by about 3.75%](https://perf.rust-lang.org/compare.html?start=6ac8adad1f7d733b5b97d1df4e7f96e73a46db42&end=209f91e1787d0d29a0e566fa93f35d52e60ea84a&stat=max-rss)
160+
while PR [#95563 degraded max-rss by about 4%](https://perf.rust-lang.org/compare.html?start=acfd327fd4e3a302ebb0a077f422a527a7935333&end=653463731a7f01f519cf85f444869def27f00395&stat=max-rss)),
161+
but its close enough to satisfy our needs.
162+
163+
proc_macro: don't pass a client-side function pointer through the server. [#97461](https://github.com/rust-lang/rust/pull/97461) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=4f39fb1f34d4bd25d9ce96afe7b2d109f073e286&end=116201eefebcf45074ae232377e7145f5fbb704b&stat=instructions:u)
164+
165+
| | mean | max | count |
166+
|:----------:|:----:|:---:|:-----:|
167+
| Regressions 😿 <br /> (primary) | 0.5% | 0.5% | 2 |
168+
| Regressions 😿 <br /> (secondary) | N/A | N/A | 0 |
169+
| Improvements 🎉 <br /> (primary) | -0.4% | -0.8% | 11 |
170+
| Improvements 🎉 <br /> (secondary) | -2.6% | -5.4% | 10 |
171+
| All 😿🎉 (primary) | -0.3% | -0.8% | 13 |
172+
173+
* On the primary benchmarks, this mostly yielded small improvements; the one exception was `serde_derive` (debug), which regressed by ~0.5% for the full and incr-full variants.
174+
* Looking at the flamegraphs for the before and after commits, and at the [table at bottom of details page](https://perf.rust-lang.org/detailed-query.html?commit=116201eefebcf45074ae232377e7145f5fbb704b&base_commit=4f39fb1f34d4bd25d9ce96afe7b2d109f073e286&benchmark=serde_derive-1.0.136-debug&scenario=full), it seems like the instruction-count regression is in `codegen_module`
175+
* Looking at the history of serde_derive-debug (massively zoomed in on the "Percent Delta from First" Graph kind), it seems reasonable to think that something *did* happen on this PR.
176+
* .... but I also don't really think its a big enough regression to be worth tearing our hair out over. This is a (smallish) win overall, and even for serde_derive-debug, it is a small regression in the context of much larger wins, so overall the trajectory is good.
177+
* Marked as triaged.
178+
179+
Replace `#[default_method_body_is_const]` with `#[const_trait]` [#96964](https://github.com/rust-lang/rust/pull/96964) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=855fc022fe879f4e3493a024f9c6b981d6317612&end=5c780b98d10f48d6255cf2deb2643194b9221c02&stat=instructions:u)
180+
181+
| | mean | max | count |
182+
|:----------:|:----:|:---:|:-----:|
183+
| Regressions 😿 <br /> (primary) | 0.2% | 0.3% | 9 |
184+
| Regressions 😿 <br /> (secondary) | 0.2% | 0.3% | 3 |
185+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
186+
| Improvements 🎉 <br /> (secondary) | -1.1% | -1.1% | 3 |
187+
| All 😿🎉 (primary) | 0.2% | 0.3% | 9 |
188+
189+
* The primary regressions are mostly in variants of stm32f4, with a two serde and one syn thrown in for good measure.
190+
* The improvements are solely to ctfe stress test, which I guess makes sense given the PR?
191+
* Anyway, the regressions seem minor, and they are contained to an unstable feature that the stdlib is using.
192+
* Marking as triaged.
193+
194+
improve format impl for literals [#97480](https://github.com/rust-lang/rust/pull/97480) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=e810f750a2a407f9caeabba39059578e844add14&end=4a8d2e3856c0c75c71998b6c85937203839b946d&stat=instructions:u)
195+
196+
| | mean | max | count |
197+
|:----------:|:----:|:---:|:-----:|
198+
| Regressions 😿 <br /> (primary) | 0.4% | 0.5% | 4 |
199+
| Regressions 😿 <br /> (secondary) | N/A | N/A | 0 |
200+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
201+
| Improvements 🎉 <br /> (secondary) | -1.5% | -1.5% | 1 |
202+
| All 😿🎉 (primary) | 0.4% | 0.5% | 4 |
203+
204+
* This is adding a fast-path so that `format!("literal")` will compile into the same code as `"literal".to_owned()`.
205+
* The primary regression is solely contained to `bitmaps`.
206+
* Its possible that the regression to `bitmaps` is due to `format!("literal")` being totally unused in that code; all instances of `format!` there take an additional argument.
207+
So its *possible* that the extra code to check about whether to use the fast-path is slowing things down there.
208+
* But I personally don't believe that explanation here: Unless I'm misunderstanding the code, there is some amount of macro-expansion into multiple instances of `format!`, but
209+
most of the expanded code is going to be dominated by all the `impl` blocks, not the relatively few `format!` instances. (Unless I massively misunderstand how the macros and/or codegen and/or inlining end up linking up here.)
210+
* So: I don't believe the best hypothesis I have for what is happening here.
211+
* But I also do not think the regression here is large enough to warrant further investigation.
212+
* Marking as triaged.
213+
214+
errors: simplify referring to fluent attributes [#97357](https://github.com/rust-lang/rust/pull/97357) [(Comparison Link)](https://perf.rust-lang.org/compare.html?start=c35035cefc709abddabfb28ecc6a326458d46ce2&end=7be9ec27652f2c3b820d341158b0e005f42e248e&stat=instructions:u)
215+
216+
| | mean | max | count |
217+
|:----------:|:----:|:---:|:-----:|
218+
| Regressions 😿 <br /> (primary) | N/A | N/A | 0 |
219+
| Regressions 😿 <br /> (secondary) | 0.4% | 0.6% | 9 |
220+
| Improvements 🎉 <br /> (primary) | N/A | N/A | 0 |
221+
| Improvements 🎉 <br /> (secondary) | -0.4% | -0.5% | 4 |
222+
| All 😿🎉 (primary) | N/A | N/A | 0 |
223+
224+
* There were some regressions here, but they are few, minor, and contained solely to secondary benchmarks (specifically projection-caching and wg-grammar). Marking as triaged.
225+
226+
#### Untriaged Pull Requests
227+
228+
- [#97019 Transition to valtrees pt1](https://github.com/rust-lang/rust/pull/97019)
229+
- [#97004 Proc macro tweaks](https://github.com/rust-lang/rust/pull/97004)
230+
- [#96883 Add EarlyBinder](https://github.com/rust-lang/rust/pull/96883)
231+
- [#96825 Retire `ItemLikeVisitor` trait](https://github.com/rust-lang/rust/pull/96825)
232+
- [#96652 rustdoc: include impl generics / self in search index](https://github.com/rust-lang/rust/pull/96652)
233+
- [#96010 Implement `core::ptr::Unique` on top of `NonNull`](https://github.com/rust-lang/rust/pull/96010)
234+
- [#95990 Rollup of 7 pull requests](https://github.com/rust-lang/rust/pull/95990)
235+
- [#95899 rustc_metadata: Do not encode unnecessary module children](https://github.com/rust-lang/rust/pull/95899)
236+
- [#95893 Respect -Z verify-llvm-ir and other flags that add extra passes when combined with -C no-prepopulate-passes in the new LLVM Pass Manager.](https://github.com/rust-lang/rust/pull/95893)
237+
- [#95835 Rollup of 8 pull requests](https://github.com/rust-lang/rust/pull/95835)
238+
- [#95794 `parse_tt`: a few more tweaks](https://github.com/rust-lang/rust/pull/95794)
239+
- [#95742 Rollup of 6 pull requests](https://github.com/rust-lang/rust/pull/95742)
240+
- [#95715 Shrink `Nonterminal`](https://github.com/rust-lang/rust/pull/95715)
241+
- [#95706 rustdoc: Early doc link resolution fixes and refactorings](https://github.com/rust-lang/rust/pull/95706)
242+
- [#95702 Rollup of 8 pull requests](https://github.com/rust-lang/rust/pull/95702)
243+
- [#95667 Rollup of 5 pull requests](https://github.com/rust-lang/rust/pull/95667)
244+
- [#95662 Rollup of 7 pull requests](https://github.com/rust-lang/rust/pull/95662)
245+
- [#95645 Fix intra doc link ICE when trying to get traits in scope for primitive](https://github.com/rust-lang/rust/pull/95645)
246+
- [#95604 Generate synthetic object file to ensure all exported and used symbols participate in the linking](https://github.com/rust-lang/rust/pull/95604)
247+
- [#95563 Move the extended lifetime resolution into typeck context](https://github.com/rust-lang/rust/pull/95563)
248+
- [#95399 Faster parsing for lower numbers for radix up to 16 (cont.)](https://github.com/rust-lang/rust/pull/95399)
249+
- [#95250 Rollup of 5 pull requests](https://github.com/rust-lang/rust/pull/95250)
250+
- [#94824 Rollup of 7 pull requests](https://github.com/rust-lang/rust/pull/94824)
251+
- [#94814 Rollup of 8 pull requests](https://github.com/rust-lang/rust/pull/94814)

0 commit comments

Comments
 (0)