Commit 41ee6df

rollup merge of #24846: dotdash/fast_cttz8
Currently, LLVM lowers a cttz8 on x86_64 to these instructions:

```asm
movzbl  %dil, %eax
bsfl    %eax, %eax
movl    $32, %ecx
cmovnel %eax, %ecx
cmpl    $32, %ecx
movl    $8, %eax
cmovnel %ecx, %eax
```

To improve the codegen, we can zero extend the 8 bit integer, then set bit 8 and perform a cttz operation on the extended value. That way there's no conditional operation involved at all.

This was discovered by this benchmark: https://github.com/Kimundi/long_strings_without_repeats

Timings on my box with the current nightly:

```
running 4 tests
test bench_cpp_naive_big   ... bench:   5479222 ns/iter (+/- 254222)
test bench_noop_big        ... bench:    571405 ns/iter (+/- 111950)
test bench_rust_naive_big  ... bench:   7798102 ns/iter (+/- 148841)
test bench_rust_unsafe_big ... bench:   6606488 ns/iter (+/- 67529)
```

Timings with the patch applied:

```
running 4 tests
test bench_cpp_naive_big   ... bench:   5470944 ns/iter (+/- 7109)
test bench_noop_big        ... bench:    568944 ns/iter (+/- 6895)
test bench_rust_naive_big  ... bench:   6795901 ns/iter (+/- 43806)
test bench_rust_unsafe_big ... bench:   5584879 ns/iter (+/- 5291)
```
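
The widening trick described in the commit message can be checked in isolation. What follows is a minimal sketch in present-day stable Rust, not the code from this commit: it uses the public `trailing_zeros` method instead of the `cttz16` intrinsic, and the names `trailing_zeros_u8_widened` and `widened_matches_builtin` are hypothetical helpers introduced only for illustration. Setting bit 8 guarantees the widened value is never zero, so an input of 0 yields 8 without any special-casing.

```rust
/// Hypothetical helper (not part of libcore): counts the trailing zeros of
/// a `u8` by zero-extending it to `u16` and setting bit 8 as a sentinel.
fn trailing_zeros_u8_widened(x: u8) -> u32 {
    // `as` binds tighter than `|`, so this is `(x as u16) | 0x100`.
    // For x == 0 the lowest set bit is the sentinel at position 8, so the
    // result is 8, matching the zero-safe semantics of an 8-bit cttz.
    (x as u16 | 0x100).trailing_zeros()
}

/// Hypothetical exhaustive check: compares the widened variant with the
/// built-in `u8::trailing_zeros` for all 256 possible inputs.
fn widened_matches_builtin() -> bool {
    (0..=u8::MAX).all(|x| trailing_zeros_u8_widened(x) == x.trailing_zeros())
}
```

Because the input space has only 256 values, the exhaustive comparison is cheap and covers every case, including the zero input that otherwise needs conditional handling.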
2 parents dfb6080 + 36dccec commit 41ee6df

2 files changed: +14 -6 lines changed
src/libcore/num/mod.rs

Lines changed: 14 additions & 1 deletion
@@ -745,7 +745,20 @@ macro_rules! uint_impl {
         #[stable(feature = "rust1", since = "1.0.0")]
         #[inline]
         pub fn trailing_zeros(self) -> u32 {
-            unsafe { $cttz(self as $ActualT) as u32 }
+            // As of LLVM 3.6 the codegen for the zero-safe cttz8 intrinsic
+            // emits two conditional moves on x86_64. By promoting the value to
+            // u16 and setting bit 8, we get better code without any conditional
+            // operations.
+            // FIXME: There's a LLVM patch (http://reviews.llvm.org/D9284)
+            // pending, remove this workaround once LLVM generates better code
+            // for cttz8.
+            unsafe {
+                if $BITS == 8 {
+                    intrinsics::cttz16(self as u16 | 0x100) as u32
+                } else {
+                    $cttz(self as $ActualT) as u32
+                }
+            }
         }
 
         /// Shifts the bits to the left by a specified amount, `n`,

src/test/run-pass/intrinsics-integer.rs

Lines changed: 0 additions & 5 deletions
@@ -109,11 +109,6 @@ pub fn main() {
         assert_eq!(cttz32(100), 2);
         assert_eq!(cttz64(100), 2);
 
-        assert_eq!(cttz8(-1), 0);
-        assert_eq!(cttz16(-1), 0);
-        assert_eq!(cttz32(-1), 0);
-        assert_eq!(cttz64(-1), 0);
-
         assert_eq!(bswap16(0x0A0B), 0x0B0A);
         assert_eq!(bswap32(0x0ABBCC0D), 0x0DCCBB0A);
         assert_eq!(bswap64(0x0122334455667708), 0x0877665544332201);
