-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SCEVExpander] Recognize urem idiom during expansion #96005
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-llvm-transforms Author: Philip Reames (preames) ChangesIf we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification. Full diff: https://github.com/llvm/llvm-project/pull/96005.diff 4 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 9808308cbfed9..03c51d86f891d 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -14972,6 +14972,10 @@ void PredicatedScalarEvolution::print(raw_ostream &OS, unsigned Depth) const {
// 4, A / B becomes X / 8).
bool ScalarEvolution::matchURem(const SCEV *Expr, const SCEV *&LHS,
const SCEV *&RHS) {
+
+ if (Expr->getType()->isPointerTy())
+ return false;
+
// Try to match 'zext (trunc A to iB) to iY', which is used
// for URem with constant power-of-2 second operands. Make sure the size of
// the operand A matches the size of the whole expressions.
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index c437a44dda8d3..32b2127f5fc5e 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -491,6 +491,17 @@ class LoopCompare {
}
Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {
+
+ // Recognize the canonical representation of an unsimplifed urem.
+ const SCEV *URemLHS = nullptr;
+ const SCEV *URemRHS = nullptr;
+ if (SE.matchURem(S, URemLHS, URemRHS)) {
+ Value *LHS = expand(URemLHS);
+ Value *RHS = expand(URemRHS);
+ return InsertBinop(Instruction::URem, LHS, RHS, SCEV::FlagAnyWrap,
+ /*IsSafeToHoist*/ false);
+ }
+
// Collect all the add operands in a loop, along with their associated loops.
// Iterate in reverse so that constants are emitted last, all else equal, and
// so that pointer operands are inserted first, which the code below relies on
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
index 95b064bd044fa..838b48aa56906 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
@@ -22,9 +22,7 @@ define i32 @test_pr38847() {
; CHECK-NEXT: [[CMP2:%.*]] = icmp sgt i8 [[LSR]], -1
; CHECK-NEXT: br i1 [[CMP2]], label [[LOOP]], label [[EXIT:%.*]]
; CHECK: exit:
-; CHECK-NEXT: [[TMP0:%.*]] = udiv i32 [[LSR_IV_NEXT2]], 9
-; CHECK-NEXT: [[TMP1:%.*]] = mul nuw i32 [[TMP0]], 9
-; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[LSR_IV_NEXT2]], [[TMP1]]
+; CHECK-NEXT: [[TMP2:%.*]] = urem i32 [[LSR_IV_NEXT2]], 9
; CHECK-NEXT: ret i32 [[TMP2]]
;
entry:
@@ -109,9 +107,7 @@ define i32 @test_pr62852() {
; CHECK: exit:
; CHECK-NEXT: call void @use(i64 [[LSR_IV_NEXT]])
; CHECK-NEXT: call void @use(i64 [[LSR_IV_NEXT2]])
-; CHECK-NEXT: [[TMP1:%.*]] = udiv i32 [[DEC_1]], 53
-; CHECK-NEXT: [[TMP2:%.*]] = mul nuw i32 [[TMP1]], 53
-; CHECK-NEXT: [[TMP3:%.*]] = sub i32 [[DEC_1]], [[TMP2]]
+; CHECK-NEXT: [[TMP3:%.*]] = urem i32 [[DEC_1]], 53
; CHECK-NEXT: ret i32 [[TMP3]]
;
entry:
diff --git a/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll b/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
index ce5d4427fa019..bf687a31d8d17 100644
--- a/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
+++ b/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
@@ -723,9 +723,7 @@ define i64 @multi_exit_4_exit_count_with_urem_by_constant_in_latch(ptr %dst, i64
; CHECK-SAME: ptr [[DST:%.*]], i64 [[N:%.*]]) {
; CHECK-NEXT: entry:
; CHECK-NEXT: [[SMAX:%.*]] = call i64 @llvm.smax.i64(i64 [[N]], i64 0)
-; CHECK-NEXT: [[TMP0:%.*]] = udiv i64 [[N]], 42
-; CHECK-NEXT: [[TMP1:%.*]] = mul nuw i64 [[TMP0]], 42
-; CHECK-NEXT: [[TMP2:%.*]] = sub i64 [[N]], [[TMP1]]
+; CHECK-NEXT: [[TMP2:%.*]] = urem i64 [[N]], 42
; CHECK-NEXT: [[SMAX1:%.*]] = call i64 @llvm.smax.i64(i64 [[TMP2]], i64 0)
; CHECK-NEXT: [[UMIN:%.*]] = call i64 @llvm.umin.i64(i64 [[SMAX]], i64 [[SMAX1]])
; CHECK-NEXT: [[TMP3:%.*]] = add nuw i64 [[UMIN]], 1
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -14972,6 +14972,10 @@ void PredicatedScalarEvolution::print(raw_ostream &OS, unsigned Depth) const { | |||
// 4, A / B becomes X / 8). | |||
bool ScalarEvolution::matchURem(const SCEV *Expr, const SCEV *&LHS, | |||
const SCEV *&RHS) { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Drop newline at start of function.
@@ -491,6 +491,17 @@ class LoopCompare { | |||
} | |||
|
|||
Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) { | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Drop newline at start of function.
If we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.
If we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.