-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SLP]Add better minbitwidth analysis for udiv/urem instructions. #85928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SLP]Add better minbitwidth analysis for udiv/urem instructions. #85928
Conversation
Created using spr 1.3.5
@llvm/pr-subscribers-llvm-transforms Author: Alexey Bataev (alexey-bataev) ChangesAdds improved bitwidth analysis for udiv/urem instructions. The Full diff: https://github.com/llvm/llvm-project/pull/85928.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 5d59f35f30810e..f8f7015e5f15f2 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -14165,6 +14165,28 @@ bool BoUpSLP::collectValuesToDemote(
return false;
break;
}
+ case Instruction::UDiv:
+ case Instruction::URem: {
+ if (ITE->UserTreeIndices.size() > 1 && !IsPotentiallyTruncated(I, BitWidth))
+ return false;
+ // UDiv and URem can be truncated if all the truncated bits are zero.
+ if (!AttemptCheckBitwidth(
+ [&](unsigned BitWidth, unsigned OrigBitWidth) {
+ assert(BitWidth <= OrigBitWidth && "Unexpected bitwidths!");
+ APInt Mask = APInt::getBitsSetFrom(OrigBitWidth, BitWidth);
+ return MaskedValueIsZero(I->getOperand(0), Mask,
+ SimplifyQuery(*DL)) &&
+ MaskedValueIsZero(I->getOperand(1), Mask,
+ SimplifyQuery(*DL));
+ },
+ NeedToExit))
+ return false;
+ if (NeedToExit)
+ return true;
+ if (!ProcessOperands({I->getOperand(0), I->getOperand(1)}, NeedToExit))
+ return false;
+ break;
+ }
// We can demote selects if we can demote their true and false values.
case Instruction::Select: {
diff --git a/llvm/test/Transforms/SLPVectorizer/X86/reorder-possible-strided-node.ll b/llvm/test/Transforms/SLPVectorizer/X86/reorder-possible-strided-node.ll
index 4a23abf182e888..cfbbe14186b501 100644
--- a/llvm/test/Transforms/SLPVectorizer/X86/reorder-possible-strided-node.ll
+++ b/llvm/test/Transforms/SLPVectorizer/X86/reorder-possible-strided-node.ll
@@ -116,9 +116,7 @@ define void @test_div() {
; CHECK-NEXT: [[TMP1:%.*]] = load <4 x i32>, ptr [[ARRAYIDX22]], align 4
; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
; CHECK-NEXT: [[TMP3:%.*]] = mul <4 x i32> [[TMP2]], [[TMP0]]
-; CHECK-NEXT: [[TMP4:%.*]] = zext <4 x i32> [[TMP3]] to <4 x i64>
-; CHECK-NEXT: [[TMP5:%.*]] = udiv <4 x i64> [[TMP4]], <i64 1, i64 2, i64 1, i64 2>
-; CHECK-NEXT: [[TMP6:%.*]] = trunc <4 x i64> [[TMP5]] to <4 x i32>
+; CHECK-NEXT: [[TMP6:%.*]] = udiv <4 x i32> [[TMP3]], <i32 1, i32 2, i32 1, i32 2>
; CHECK-NEXT: store <4 x i32> [[TMP6]], ptr getelementptr inbounds ([4 x i32], ptr null, i64 8, i64 0), align 16
; CHECK-NEXT: ret void
;
@@ -170,9 +168,7 @@ define void @test_rem() {
; CHECK-NEXT: [[TMP1:%.*]] = load <4 x i32>, ptr [[ARRAYIDX22]], align 4
; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <4 x i32> [[TMP1]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
; CHECK-NEXT: [[TMP3:%.*]] = mul <4 x i32> [[TMP2]], [[TMP0]]
-; CHECK-NEXT: [[TMP4:%.*]] = zext <4 x i32> [[TMP3]] to <4 x i64>
-; CHECK-NEXT: [[TMP5:%.*]] = urem <4 x i64> [[TMP4]], <i64 1, i64 2, i64 1, i64 1>
-; CHECK-NEXT: [[TMP6:%.*]] = trunc <4 x i64> [[TMP5]] to <4 x i32>
+; CHECK-NEXT: [[TMP6:%.*]] = urem <4 x i32> [[TMP3]], <i32 1, i32 2, i32 1, i32 1>
; CHECK-NEXT: store <4 x i32> [[TMP6]], ptr getelementptr inbounds ([4 x i32], ptr null, i64 8, i64 0), align 16
; CHECK-NEXT: ret void
;
|
MaskedValueIsZero(I->getOperand(1), Mask, | ||
SimplifyQuery(*DL)); | ||
}, | ||
NeedToExit)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is starting to have a lot of code similar to canEvaluateTruncated in InstCombineCasts.cpp - can we reuse that code somehow - move it to ValueTracking.h or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I thought about this. Actually, only the code for shifts/div/rem is duplicated, all other code is unique and/or requires some extra processing, which canEvaluateTruncated does not have. We can think about it later.
✅ With the latest revision this PR passed the Python code formatter. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Ping! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Adds improved bitwidth analysis for udiv/urem instructions. The
analysis is based on similar version in InstCombiner.