[InstCombine] Canonicalize Bit Testing by Shifting to Bit 0 #101838
Conversation
Implement a new transformation that folds the bit-testing expression (icmp ne (and (lshr V B) 1) 0) to (icmp ne (and V (shl 1 B)) 0) for constant V. This rule already existed for non-constant V and for constants other than 1; the restriction to non-constant V was added in commit c3b2111 to fix an infinite loop. Avoid the infinite loop by allowing constant V only if the shift instruction is an lshr and the constant is 1. Also fold the negated variant of the LHS. This transformation necessitates an adaptation of existing tests in `icmp-and-shift.ll` and `load-cmp.ll`. One test in `icmp-and-shift.ll`, which previously was a negative test, now gets folded. Rename it to indicate that it is a positive test.

Alive proof: https://alive2.llvm.org/ce/z/vcJJTx

Relates to issue llvm#86813.
@llvm/pr-subscribers-llvm-transforms

Author: None (mskamp)

Full diff: https://github.com/llvm/llvm-project/pull/101838.diff (3 files affected)
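For illustration, a minimal before/after sketch of the fold for a constant left-hand side (the function names are illustrative, not taken from the patch; the pattern mirrors the `test_const_shr_and_1_ne_0` test below):

```llvm
; Before: bit %b of the constant 42 is tested by shifting the constant
; right and masking out bit 0.
define i1 @bit_test_before(i32 %b) {
  %shr = lshr i32 42, %b
  %and = and i32 %shr, 1
  %cmp = icmp ne i32 %and, 0
  ret i1 %cmp
}

; After: the canonical form builds the mask (shl 1 %b) and tests it
; against the original constant.
define i1 @bit_test_after(i32 %b) {
  %mask = shl nuw i32 1, %b
  %and = and i32 %mask, 42
  %cmp = icmp ne i32 %and, 0
  ret i1 %cmp
}
```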
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 3b6df2760ecc2..66809dbf0cbd5 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -1725,7 +1725,8 @@ Instruction *InstCombinerImpl::foldICmpAndShift(ICmpInst &Cmp,
// preferable because it allows the C2 << Y expression to be hoisted out of a
// loop if Y is invariant and X is not.
if (Shift->hasOneUse() && C1.isZero() && Cmp.isEquality() &&
- !Shift->isArithmeticShift() && !isa<Constant>(Shift->getOperand(0))) {
+ !Shift->isArithmeticShift() &&
+ ((!IsShl && C2.isOne()) || !isa<Constant>(Shift->getOperand(0)))) {
// Compute C2 << Y.
Value *NewShift =
IsShl ? Builder.CreateLShr(And->getOperand(1), Shift->getOperand(1))
diff --git a/llvm/test/Transforms/InstCombine/icmp-and-shift.ll b/llvm/test/Transforms/InstCombine/icmp-and-shift.ll
index 08d23e84c3960..65b9749ffc1bf 100644
--- a/llvm/test/Transforms/InstCombine/icmp-and-shift.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-and-shift.ll
@@ -404,11 +404,10 @@ define <2 x i32> @icmp_ne_and_pow2_lshr_pow2_vec(<2 x i32> %0) {
ret <2 x i32> %conv
}
-define i32 @icmp_eq_and1_lshr_pow2_negative1(i32 %0) {
-; CHECK-LABEL: @icmp_eq_and1_lshr_pow2_negative1(
-; CHECK-NEXT: [[LSHR:%.*]] = lshr i32 7, [[TMP0:%.*]]
-; CHECK-NEXT: [[AND:%.*]] = and i32 [[LSHR]], 1
-; CHECK-NEXT: [[CONV:%.*]] = xor i32 [[AND]], 1
+define i32 @icmp_eq_and1_lshr_pow2_minus_one(i32 %0) {
+; CHECK-LABEL: @icmp_eq_and1_lshr_pow2_minus_one(
+; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i32 [[TMP0:%.*]], 2
+; CHECK-NEXT: [[CONV:%.*]] = zext i1 [[CMP]] to i32
; CHECK-NEXT: ret i32 [[CONV]]
;
%lshr = lshr i32 7, %0
@@ -606,3 +605,136 @@ define i1 @fold_ne_rhs_fail_shift_not_1s(i8 %x, i8 %yy) {
%r = icmp ne i8 %and, 0
ret i1 %r
}
+
+define i1 @test_shr_and_1_ne_0(i32 %a, i32 %b) {
+; CHECK-LABEL: @test_shr_and_1_ne_0(
+; CHECK-NEXT: [[TMP1:%.*]] = shl nuw i32 1, [[B:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], [[A:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[TMP2]], 0
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %shr = lshr i32 %a, %b
+ %and = and i32 %shr, 1
+ %cmp = icmp ne i32 %and, 0
+ ret i1 %cmp
+}
+
+define i1 @test_const_shr_and_1_ne_0(i32 %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0(
+; CHECK-NEXT: [[TMP1:%.*]] = shl nuw i32 1, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[TMP1]], 42
+; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[AND]], 0
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %shr = lshr i32 42, %b
+ %and = and i32 %shr, 1
+ %cmp = icmp ne i32 %and, 0
+ ret i1 %cmp
+}
+
+define i1 @test_not_const_shr_and_1_ne_0(i32 %b) {
+; CHECK-LABEL: @test_not_const_shr_and_1_ne_0(
+; CHECK-NEXT: [[TMP1:%.*]] = shl nuw i32 1, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[TMP1]], 42
+; CHECK-NEXT: [[CMP_NOT:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT: ret i1 [[CMP_NOT]]
+;
+ %shr = lshr i32 42, %b
+ %and = and i32 %shr, 1
+ %cmp = icmp eq i32 %and, 0
+ ret i1 %cmp
+}
+
+define i1 @test_const_shr_exact_and_1_ne_0(i32 %b) {
+; CHECK-LABEL: @test_const_shr_exact_and_1_ne_0(
+; CHECK-NEXT: [[TMP1:%.*]] = shl nuw i32 1, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[TMP1]], 42
+; CHECK-NEXT: [[CMP:%.*]] = icmp ne i32 [[AND]], 0
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %shr = lshr exact i32 42, %b
+ %and = and i32 %shr, 1
+ %cmp = icmp ne i32 %and, 0
+ ret i1 %cmp
+}
+
+define i1 @test_const_shr_and_2_ne_0_negative(i32 %b) {
+; CHECK-LABEL: @test_const_shr_and_2_ne_0_negative(
+; CHECK-NEXT: [[SHR:%.*]] = lshr i32 42, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[SHR]], 2
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[AND]], 0
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %shr = lshr i32 42, %b
+ %and = and i32 %shr, 2
+ %cmp = icmp eq i32 %and, 0
+ ret i1 %cmp
+}
+
+define <8 x i1> @test_const_shr_and_1_ne_0_v8i8_splat_negative(<8 x i8> %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0_v8i8_splat_negative(
+; CHECK-NEXT: [[SHR:%.*]] = lshr <8 x i8> <i8 42, i8 42, i8 42, i8 42, i8 42, i8 42, i8 42, i8 42>, [[B:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = trunc <8 x i8> [[SHR]] to <8 x i1>
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %shr = lshr <8 x i8> <i8 42, i8 42, i8 42, i8 42, i8 42, i8 42, i8 42, i8 42>, %b
+ %and = and <8 x i8> %shr, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
+ %cmp = icmp ne <8 x i8> %and, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @test_const_shr_and_1_ne_0_v8i8_nonsplat_negative(<8 x i8> %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0_v8i8_nonsplat_negative(
+; CHECK-NEXT: [[SHR:%.*]] = lshr <8 x i8> <i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49>, [[B:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = trunc <8 x i8> [[SHR]] to <8 x i1>
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %shr = lshr <8 x i8> <i8 42, i8 43, i8 44, i8 45, i8 46, i8 47, i8 48, i8 49>, %b
+ %and = and <8 x i8> %shr, <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>
+ %cmp = icmp ne <8 x i8> %and, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define i1 @test_const_shr_and_1_ne_0_i1_negative(i1 %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0_i1_negative(
+; CHECK-NEXT: ret i1 true
+;
+ %shr = lshr i1 1, %b
+ %and = and i1 %shr, 1
+ %cmp = icmp ne i1 %and, 0
+ ret i1 %cmp
+}
+
+define i1 @test_const_shr_and_1_ne_0_multi_use_lshr_negative(i32 %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0_multi_use_lshr_negative(
+; CHECK-NEXT: [[SHR:%.*]] = lshr i32 42, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[SHR]], 1
+; CHECK-NEXT: [[CMP1:%.*]] = icmp ne i32 [[AND]], 0
+; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i32 [[SHR]], [[B]]
+; CHECK-NEXT: [[RET:%.*]] = and i1 [[CMP1]], [[CMP2]]
+; CHECK-NEXT: ret i1 [[RET]]
+;
+ %shr = lshr i32 42, %b
+ %and = and i32 %shr, 1
+ %cmp1 = icmp ne i32 %and, 0
+ %cmp2 = icmp eq i32 %b, %shr
+ %ret = and i1 %cmp1, %cmp2
+ ret i1 %ret
+}
+
+define i1 @test_const_shr_and_1_ne_0_multi_use_and_negative(i32 %b) {
+; CHECK-LABEL: @test_const_shr_and_1_ne_0_multi_use_and_negative(
+; CHECK-NEXT: [[SHR:%.*]] = lshr i32 42, [[B:%.*]]
+; CHECK-NEXT: [[AND:%.*]] = and i32 [[SHR]], 1
+; CHECK-NEXT: [[CMP1:%.*]] = icmp ne i32 [[AND]], 0
+; CHECK-NEXT: [[CMP2:%.*]] = icmp eq i32 [[AND]], [[B]]
+; CHECK-NEXT: [[RET:%.*]] = and i1 [[CMP1]], [[CMP2]]
+; CHECK-NEXT: ret i1 [[RET]]
+;
+ %shr = lshr i32 42, %b
+ %and = and i32 %shr, 1
+ %cmp1 = icmp ne i32 %and, 0
+ %cmp2 = icmp eq i32 %b, %and
+ %ret = and i1 %cmp1, %cmp2
+ ret i1 %ret
+}
diff --git a/llvm/test/Transforms/InstCombine/load-cmp.ll b/llvm/test/Transforms/InstCombine/load-cmp.ll
index b956de29e0b8d..8e39fe33cded8 100644
--- a/llvm/test/Transforms/InstCombine/load-cmp.ll
+++ b/llvm/test/Transforms/InstCombine/load-cmp.ll
@@ -109,8 +109,8 @@ define i1 @test3(i32 %X) {
define i1 @test4(i32 %X) {
; CHECK-LABEL: @test4(
-; CHECK-NEXT: [[TMP1:%.*]] = lshr i32 933, [[X:%.*]]
-; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 1
+; CHECK-NEXT: [[TMP1:%.*]] = shl nuw i32 1, [[X:%.*]]
+; CHECK-NEXT: [[TMP2:%.*]] = and i32 [[TMP1]], 933
; CHECK-NEXT: [[R:%.*]] = icmp ne i32 [[TMP2]], 0
; CHECK-NEXT: ret i1 [[R]]
;
@@ -123,8 +123,8 @@ define i1 @test4(i32 %X) {
define i1 @test4_i16(i16 %X) {
; CHECK-LABEL: @test4_i16(
; CHECK-NEXT: [[TMP1:%.*]] = zext nneg i16 [[X:%.*]] to i32
-; CHECK-NEXT: [[TMP2:%.*]] = lshr i32 933, [[TMP1]]
-; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP2]], 1
+; CHECK-NEXT: [[TMP2:%.*]] = shl nuw i32 1, [[TMP1]]
+; CHECK-NEXT: [[TMP3:%.*]] = and i32 [[TMP2]], 933
; CHECK-NEXT: [[R:%.*]] = icmp ne i32 [[TMP3]], 0
; CHECK-NEXT: ret i1 [[R]]
;
LGTM. Thank you!
LGTM
LGTM, but it looks like there is a regression in this case: https://llvm.godbolt.org/z/8dGhf9Wz1
Good point. I think we could adapt the transformation in InstCombineCasts.cpp:988 to also support smaller bit widths (and insert a trunc in this case). I could work on such a change if this is fine.
Sounds good to me!
Previously, (zext (icmp ne (and X, (1 << ShAmt)), 0)) was folded only if the bit widths of X and the result were equal. Use a trunc or zext instruction to also support other bit widths. This is a follow-up to commit 533190a, which introduced a regression: (zext (icmp ne (and (lshr X ShAmt) 1) 0)) is no longer folded to (zext/trunc (and (lshr X ShAmt) 1)) because that commit introduced the fold of (icmp ne (and (lshr X ShAmt) 1) 0) to (icmp ne (and X (1 << ShAmt)) 0). The change introduced by this commit restores the original fold.

Alive proof: https://alive2.llvm.org/ce/z/MFkNXs

Relates to issue #86813 and pull request #101838.
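For illustration, a sketch of the narrower-result case this follow-up enables, assuming an i32 source and an i16 result (function names are illustrative, not taken from the patch):

```llvm
; Before the follow-up: the canonical single-bit test from commit 533190a
; leaves an icmp/zext pair because the result (i16) is narrower than %x (i32).
define i16 @zext_bit_test_before(i32 %x, i32 %sh) {
  %mask = shl i32 1, %sh
  %and = and i32 %x, %mask
  %cmp = icmp ne i32 %and, 0
  %r = zext i1 %cmp to i16
  ret i16 %r
}

; After: shift the tested bit down to bit 0, mask it, and truncate to the
; narrower result type, avoiding the icmp/zext pair.
define i16 @zext_bit_test_after(i32 %x, i32 %sh) {
  %shr = lshr i32 %x, %sh
  %and = and i32 %shr, 1
  %r = trunc i32 %and to i16
  ret i16 %r
}
```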