Skip to content

[SCEVExpander] Recognize urem idiom during expansion #96005

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 19, 2024

Conversation

preames
Copy link
Collaborator

@preames preames commented Jun 18, 2024

If we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.

If we have a urem expression, emitting it as a urem is significantly better
that letting the fully expansion kick in.  We have the risk of a udiv or
mul which could have previously been shared, but loosing that seems like
a reasonable tradeoff for being able to round trip a urem w/o modification.
@preames preames requested a review from nikic as a code owner June 18, 2024 22:37
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Jun 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)

Changes

If we have a urem expression, emitting it as a urem is significantly better that letting the fully expansion kick in. We have the risk of a udiv or mul which could have previously been shared, but loosing that seems like a reasonable tradeoff for being able to round trip a urem w/o modification.


Full diff: https://github.com/llvm/llvm-project/pull/96005.diff

4 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+4)
  • (modified) llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp (+11)
  • (modified) llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll (+2-6)
  • (modified) llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll (+1-3)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 9808308cbfed9..03c51d86f891d 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -14972,6 +14972,10 @@ void PredicatedScalarEvolution::print(raw_ostream &OS, unsigned Depth) const {
 // 4, A / B becomes X / 8).
 bool ScalarEvolution::matchURem(const SCEV *Expr, const SCEV *&LHS,
                                 const SCEV *&RHS) {
+
+  if (Expr->getType()->isPointerTy())
+    return false;
+
   // Try to match 'zext (trunc A to iB) to iY', which is used
   // for URem with constant power-of-2 second operands. Make sure the size of
   // the operand A matches the size of the whole expressions.
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index c437a44dda8d3..32b2127f5fc5e 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -491,6 +491,17 @@ class LoopCompare {
 }
 
 Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {
+
+  // Recognize the canonical representation of an unsimplifed urem.
+  const SCEV *URemLHS = nullptr;
+  const SCEV *URemRHS = nullptr;
+  if (SE.matchURem(S, URemLHS, URemRHS)) {
+    Value *LHS = expand(URemLHS);
+    Value *RHS = expand(URemRHS);
+    return InsertBinop(Instruction::URem, LHS, RHS, SCEV::FlagAnyWrap,
+                      /*IsSafeToHoist*/ false);
+  }
+
   // Collect all the add operands in a loop, along with their associated loops.
   // Iterate in reverse so that constants are emitted last, all else equal, and
   // so that pointer operands are inserted first, which the code below relies on
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
index 95b064bd044fa..838b48aa56906 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/postinc-iv-used-by-urem-and-udiv.ll
@@ -22,9 +22,7 @@ define i32 @test_pr38847() {
 ; CHECK-NEXT:    [[CMP2:%.*]] = icmp sgt i8 [[LSR]], -1
 ; CHECK-NEXT:    br i1 [[CMP2]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
-; CHECK-NEXT:    [[TMP0:%.*]] = udiv i32 [[LSR_IV_NEXT2]], 9
-; CHECK-NEXT:    [[TMP1:%.*]] = mul nuw i32 [[TMP0]], 9
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i32 [[LSR_IV_NEXT2]], [[TMP1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = urem i32 [[LSR_IV_NEXT2]], 9
 ; CHECK-NEXT:    ret i32 [[TMP2]]
 ;
 entry:
@@ -109,9 +107,7 @@ define i32 @test_pr62852() {
 ; CHECK:       exit:
 ; CHECK-NEXT:    call void @use(i64 [[LSR_IV_NEXT]])
 ; CHECK-NEXT:    call void @use(i64 [[LSR_IV_NEXT2]])
-; CHECK-NEXT:    [[TMP1:%.*]] = udiv i32 [[DEC_1]], 53
-; CHECK-NEXT:    [[TMP2:%.*]] = mul nuw i32 [[TMP1]], 53
-; CHECK-NEXT:    [[TMP3:%.*]] = sub i32 [[DEC_1]], [[TMP2]]
+; CHECK-NEXT:    [[TMP3:%.*]] = urem i32 [[DEC_1]], 53
 ; CHECK-NEXT:    ret i32 [[TMP3]]
 ;
 entry:
diff --git a/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll b/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
index ce5d4427fa019..bf687a31d8d17 100644
--- a/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
+++ b/llvm/test/Transforms/LoopVectorize/trip-count-expansion-may-introduce-ub.ll
@@ -723,9 +723,7 @@ define i64 @multi_exit_4_exit_count_with_urem_by_constant_in_latch(ptr %dst, i64
 ; CHECK-SAME: ptr [[DST:%.*]], i64 [[N:%.*]]) {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[SMAX:%.*]] = call i64 @llvm.smax.i64(i64 [[N]], i64 0)
-; CHECK-NEXT:    [[TMP0:%.*]] = udiv i64 [[N]], 42
-; CHECK-NEXT:    [[TMP1:%.*]] = mul nuw i64 [[TMP0]], 42
-; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 [[N]], [[TMP1]]
+; CHECK-NEXT:    [[TMP2:%.*]] = urem i64 [[N]], 42
 ; CHECK-NEXT:    [[SMAX1:%.*]] = call i64 @llvm.smax.i64(i64 [[TMP2]], i64 0)
 ; CHECK-NEXT:    [[UMIN:%.*]] = call i64 @llvm.umin.i64(i64 [[SMAX]], i64 [[SMAX1]])
 ; CHECK-NEXT:    [[TMP3:%.*]] = add nuw i64 [[UMIN]], 1

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -14972,6 +14972,10 @@ void PredicatedScalarEvolution::print(raw_ostream &OS, unsigned Depth) const {
// 4, A / B becomes X / 8).
bool ScalarEvolution::matchURem(const SCEV *Expr, const SCEV *&LHS,
const SCEV *&RHS) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Drop newline at start of function.

@@ -491,6 +491,17 @@ class LoopCompare {
}

Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Drop newline at start of function.

@preames preames merged commit cb76896 into llvm:main Jun 19, 2024
4 of 5 checks passed
@preames preames deleted the pr-scev-expander-urem-round-trip branch June 19, 2024 15:40
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
If we have a urem expression, emitting it as a urem is significantly
better that letting the fully expansion kick in. We have the risk of a
udiv or mul which could have previously been shared, but loosing that
seems like a reasonable tradeoff for being able to round trip a urem w/o
modification.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants