Refactor significance calculations #1003

Mark-Simulacrum · 2021-09-14T20:40:51Z

This moves significance factor calculation to the backend and does some refactoring around that, but doesn't actually change the calculation significantly.

The minor difference is that the significance threshold is calculated as median + ... rather than q3 + ..., which seems more appropriate. Basing the significance threshold on the q3 marker introduces a sort of "partial iqr" and that feels fishy.

This moves significance factor calculation to the backend, and refactors the code surrounding that, but makes no changes yet.

rylev · 2021-09-15T12:59:58Z

site/src/comparison.rs

                .unwrap_or(Self::SIGNIFICANT_RELATIVE_CHANGE_THRESHOLD)
        }
    }

+    /// This is a numeric magnitude of a particular change.
+    fn significance_factor(&self) -> Option<f64> {


What's the reason for this being an optional?

Hm, I think this was leftover from an earlier version I had locally. The reasoning was that we may not always be able to compute a change's magnitude in a reasonable way -- we approximate significance with the 0.002 threshold constants, but there's no such thing really for "how significant" if there's no prior data.

I actually think this may be a little buggy in the sense that I think we should show something like "-" in the UI if we don't have enough historical data to judge the factor, not just scale based on the 0.002 threshold, which would be misleading.
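To illustrate the idea in the comment above, here is a minimal sketch of an optional significance factor that is rendered as "-" when there is no historical data. The `Comparison` struct, its fields, and `render_factor` are hypothetical names chosen for this example; they are not the actual types in `site/src/comparison.rs`.

```rust
/// Hypothetical stand-in for the comparison type; the field names are
/// assumptions made for this sketch, not the real rustc-perf definitions.
struct Comparison {
    /// Relative change between the two results, e.g. 0.01 for +1%.
    relative_change: f64,
    /// Threshold derived from historical data, if enough history exists.
    historical_threshold: Option<f64>,
}

impl Comparison {
    /// Magnitude of the change in units of the significance threshold,
    /// or `None` when there is no usable historical threshold.
    fn significance_factor(&self) -> Option<f64> {
        let threshold = self.historical_threshold?;
        if threshold <= 0.0 {
            // A non-positive threshold cannot be used as a scale.
            return None;
        }
        Some(self.relative_change.abs() / threshold)
    }
}

/// Renders the factor for display, using "-" when it cannot be computed.
fn render_factor(factor: Option<f64>) -> String {
    match factor {
        Some(f) => format!("{:.1}x", f),
        None => "-".to_string(),
    }
}
```

With this shape, the caller never has to invent a fallback scale: a missing factor stays `None` all the way to the UI.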

rylev · 2021-09-15T13:02:39Z

site/src/comparison.rs

+    fn significance_threshold(&self) -> f64 {
+        let (q1, median, q3) = self.quartiles();
+
+        // Changes that are IQR_MULTIPLIER away from the median are considered


Can you talk more about why you changed calculation from q3 + IQR to median + IQR? I believe outlier fences are normally calculated from the quartiles and not from the median.

Hm. So my reasoning here was that using the median felt somehow "better" -- adding IQR to the median seems more reasonable than adding it to Q3, which sort of already includes part of the IQR. This was also partially based on my (incorrect) recollection that my statistics classes had it based on the median, though since the pages you've linked to all use q3, I was probably just wrong here :)

In practice our median and q3 are usually pretty close, so this probably has a negligible difference.
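As a rough sketch of the two variants being compared, the functions below compute both thresholds from the same quartiles. The function names, the median-of-halves quartile method, and the multiplier passed by the caller are assumptions for illustration; the real `quartiles()` and `IQR_MULTIPLIER` in the PR may differ.

```rust
/// Returns (q1, median, q3) for already-sorted data using the
/// median-of-halves convention (one of several common conventions).
fn quartiles(sorted: &[f64]) -> (f64, f64, f64) {
    assert!(sorted.len() >= 2, "need at least two data points");

    fn median(s: &[f64]) -> f64 {
        let n = s.len();
        if n % 2 == 0 {
            (s[n / 2 - 1] + s[n / 2]) / 2.0
        } else {
            s[n / 2]
        }
    }

    let n = sorted.len();
    let lower = &sorted[..n / 2]; // values below the median
    let upper = &sorted[(n + 1) / 2..]; // values above the median
    (median(lower), median(sorted), median(upper))
}

/// Textbook upper outlier fence: q3 + IQR * multiplier.
fn q3_based_threshold(q1: f64, q3: f64, multiplier: f64) -> f64 {
    q3 + (q3 - q1) * multiplier
}

/// Variant discussed in this thread: median + IQR * multiplier.
fn median_based_threshold(q1: f64, median: f64, q3: f64, multiplier: f64) -> f64 {
    median + (q3 - q1) * multiplier
}
```

The two results always differ by exactly `q3 - median`, so when the median and q3 are close the choice has little effect.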

Hmm I'm not sure if it feels better to me. Perhaps you could try to explain more why it does?

In any case, if it's unlikely to make a huge difference, perhaps we should just stick with the textbook definition?

I'm happy to revert this particular change.

I think it mostly felt like we ended up with something like

median + IQR * (some coefficient representing how "right skewed" we are) + IQR * IQR_MULTIPLIER

where the middle term is the Q3-median delta. That felt a little weird for our purposes, where we don't actually care about the right skew that much. In practice we will always have high right skew, since all data is bounded to [0, infinity), so there's practically no way to be left skewed.
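Spelled out, the decomposition above is just a rewriting of the q3-based fence. The symbols $k$ (standing in for IQR_MULTIPLIER) and $s$ (the skew coefficient) are names introduced here for illustration, and the derivation assumes $\mathrm{IQR} > 0$:

```latex
\begin{aligned}
\mathrm{IQR} &= Q_3 - Q_1, \qquad s = \frac{Q_3 - \mathrm{median}}{\mathrm{IQR}} \in [0, 1] \\
Q_3 + k \cdot \mathrm{IQR}
  &= \mathrm{median} + (Q_3 - \mathrm{median}) + k \cdot \mathrm{IQR} \\
  &= \mathrm{median} + s \cdot \mathrm{IQR} + k \cdot \mathrm{IQR}
\end{aligned}
```

For data whose quartiles are symmetric around the median, $s = 0.5$; values of $s$ above that correspond to the right-skewed case described above.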

That makes sense. I don't feel too strongly about this, which I guess is why my gut says to fall back to the book mechanism, but I'll leave it to you.

However, can you update the docs where we explain this? https://github.com/rust-lang/rustc-perf/blob/master/docs/comparison-analysis.md#what-makes-a-test-result-significant

I'll just revert this change I think, that seems best. I'll update the constant 1.5 in the docs as well.

Refactor significance calculations

21c3774


Mark-Simulacrum enabled auto-merge September 14, 2021 20:40

Mark-Simulacrum merged commit 8d3a2e9 into rust-lang:master Sep 14, 2021

Mark-Simulacrum deleted the std-dev branch September 14, 2021 20:57

Mark-Simulacrum restored the std-dev branch September 14, 2021 20:57

Mark-Simulacrum deleted the std-dev branch September 14, 2021 20:57

rylev reviewed Sep 15, 2021
