Skip to content

Violin span fix #2650

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 22, 2018
Merged

Violin span fix #2650

merged 6 commits into from
May 22, 2018

Conversation

etpinard
Copy link
Contributor

@etpinard etpinard commented May 22, 2018

Fixes #2631 and supersedes #2552

This PR uses @joshua-gould's 9329a4d where the set bandwidth value is used to determine the number of pts in the kde estimate (as opposed to the min of the set bandwidth and the Silverman's rule of thumb).

Moreover, this PR uses Scott's rule of thumb when the Silverman value yields too many kde points. This happens when the IQR is not a very good estimate of the distribution's spread like in the data of #2552 (comment).

cc @alexcjohnson

@etpinard etpinard added bug something broken status: reviewable labels May 22, 2018
@alexcjohnson
Copy link
Collaborator

I still have two concerns about this:

  • The user can still lock things up with a custom bandwidth that's sufficiently small (and potentially even automatically, say you provide a million zeros and one one? That might still be OK though...) - we need some absolute limit.
  • A sharp transition from Silverman to Scott means that some distributions that are actually narrower will end up looking wider. In your example say you tweaked it to be just on the Silverman side of the limit: you'd get 1e5 bandwidths in the span, then narrow it a little and it would pop out to essentially what's in your test image, looks like ~50 bandwidths in the span. I guess I don't mind if we reach a point where distribution width stops affecting bandwidth, but not if the dependence goes the wrong direction.

What if the automatic bandwidth just used Silverman but limited it to > (max - min)/100 or something? Maybe /100 is too wide still? /1000? I think a narrower result would do a better job showing the shape of this particular distribution anyhow, don't worry about matching Seaborn. And then the 1e5 limit would constrain all cases, manual or automatic. That's far narrower than one pixel, until you've zoomed waaaay in.

- to avoid "sharp" transitions from Silverman to Scott value
- make custom bandwidth value use that same fallback when "too small"
- rename "scott-rule" mock to "bandwidth-edge-cases"
@etpinard
Copy link
Contributor Author

What if the automatic bandwidth just used Silverman but limited it to > (max - min)/100 or something?

Going with /100 in -> 36cc37d

// to not blow up kde computations.
return ((cdi.max - cdi.min) / bw) < 1e5 ?
bw :
(cdi.max - cdi.min) / 100;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, that's still going to jump from 1/1e5 up to 1/100. I had in mind something like:

if(trace.bandwidth) {
    return Math.max(trace.bandwidth, span / 1e5);
}
else {
    return Math.max(silvermanValue, span / 100);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. My bad -> c554070

etpinard added 2 commits May 22, 2018 16:43
.. this time to "really" make the transition smooth
(cdi.max - cdi.min) / 100;

if(trace.bandwidth) {
return Math.max(trace.bandwidth, span / 1e4);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From slack, for the record:

@etpinard: limiting custom bandwidth at (span / 1e5) is still pretty slow. ok if I increase it to span / 1e4 ?

Me: sure, that’s still more than 1 bw/px. I’m not quite so worried about it though, at that point the user has asked for something explicitly ridiculous… but I guess we don’t want our mock to take forever to render!

@alexcjohnson
Copy link
Collaborator

Looks great! 💃

@etpinard etpinard merged commit 857cd05 into master May 22, 2018
@etpinard etpinard deleted the violin-span-fix branch May 22, 2018 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants