Skip to content

Speed up dec2flt fast path with additional tables. #30639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 13, 2016

Conversation

hanna-kruppe
Copy link
Contributor

Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.

Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a benchmark comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.

@rust-highfive
Copy link
Contributor

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@hanna-kruppe
Copy link
Contributor Author

Part of #30638

@hanna-kruppe hanna-kruppe force-pushed the dec2flt-fastpath-tables branch 2 times, most recently from 3d698d1 to 5cd98e0 Compare December 31, 2015 20:31
@alexcrichton
Copy link
Member

@bors: r+ 5cd98e0913f5f6165fb55097ad2861f8cdc3b2fb

Nice!

@bors
Copy link
Collaborator

bors commented Jan 12, 2016

⌛ Testing commit 5cd98e0 with merge 73129be...

} else {
Some(T::from_int(f) / fp_to_float(power_of_ten(-e)))
Some(T::from_int(f) / T::short_fast_pow10((-e) as usize))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow is ident as type really more tightly binding than negation in our grammar? Or did you just find this easier to read?

(My eyes kept folding the two opening parens into just one, leading me to think the cast was applied to the result of the call)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simply don't know which binds more tightly. I checked now and it seems -e as usize would work, but ask me tomorrow and I'll be unsure again. I should just use e.abs().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. If the Playground did not deceive me, e.abs() should even generate identical code.

@bors
Copy link
Collaborator

bors commented Jan 12, 2016

💔 Test failed - auto-win-gnu-64-opt

@hanna-kruppe
Copy link
Contributor Author

Failure seems unrelated. I want to fix the style thing @pnkfelix pointed out so no need to retry immediately.

Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.

Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.

[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
@hanna-kruppe hanna-kruppe force-pushed the dec2flt-fastpath-tables branch from 5cd98e0 to dad1df6 Compare January 12, 2016 21:28
@alexcrichton
Copy link
Member

@bors: r+ dad1df6

@bors
Copy link
Collaborator

bors commented Jan 13, 2016

⌛ Testing commit dad1df6 with merge f1bcfdd...

bors added a commit that referenced this pull request Jan 13, 2016
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.

Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.

[benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
@bors bors merged commit dad1df6 into rust-lang:master Jan 13, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants