-
Notifications
You must be signed in to change notification settings - Fork 13.4k
Speed up dec2flt fast path with additional tables. #30639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
r? @aturon (rust_highfive has picked a reviewer for you, use r? to override) |
Part of #30638 |
3d698d1
to
5cd98e0
Compare
@bors: r+ 5cd98e0913f5f6165fb55097ad2861f8cdc3b2fb Nice! |
⌛ Testing commit 5cd98e0 with merge 73129be... |
} else { | ||
Some(T::from_int(f) / fp_to_float(power_of_ten(-e))) | ||
Some(T::from_int(f) / T::short_fast_pow10((-e) as usize)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow is ident as type
really more tightly binding than negation in our grammar? Or did you just find this easier to read?
(My eyes kept folding the two opening parens into just one, leading me to think the cast was applied to the result of the call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I simply don't know which binds more tightly. I checked now and it seems -e as usize
would work, but ask me tomorrow and I'll be unsure again. I should just use e.abs()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. If the Playground did not deceive me, e.abs()
should even generate identical code.
💔 Test failed - auto-win-gnu-64-opt |
Failure seems unrelated. I want to fix the style thing @pnkfelix pointed out so no need to retry immediately. |
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path. Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference. [benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
5cd98e0
to
dad1df6
Compare
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path. Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a [benchmark] comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference. [benchmark]: https://gist.github.com/Veedrac/dbb0c07994bc7882098e
Add tables of small powers of ten used in the fast path. The tables are redundant: We could also use the big, more accurate table and round the value to the correct type (in fact we did just that before this commit). However, the rounding is extra work and slows down the fast path.
Because only very small exponents enter the fast path, the table and thus the space overhead is negligible. Speed-wise, this is a clear win on a benchmark comparing the fast path to a naive, hand-optimized, inaccurate algorithm. Specifically, this change narrows the gap from a roughly 5x difference to a roughly 3.4x difference.