Skip to content

ValueTracking: simplify udiv/urem recurrences #108973

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 7, 2024

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Sep 17, 2024

A urem recurrence has the property that the result can never exceed the start value. A udiv recurrence has the property that the result can never exceed either the start value or the numerator, whichever is greater. Implement a simplification based on these properties.

@artagnon artagnon marked this pull request as ready for review September 18, 2024 21:04
@artagnon artagnon requested a review from nikic as a code owner September 18, 2024 21:04
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Sep 18, 2024
@artagnon artagnon requested a review from dtcxzyw September 18, 2024 21:04
@llvmbot
Copy link
Member

llvmbot commented Sep 18, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Ramkumar Ramachandra (artagnon)

Changes

matchSimpleRecurrence only misses recurrence cases for UDiv, SDiv, URem, and SRem, as no meaningful simplification can be performed on an Xor recurrence using the KnownBits infrastructure; Xor with zero is already simplified. Hence, write simplifications for UDiv, SDiv, URem and SRem recurrences using KnownBits, completing the function.

-- 8< --
Based on #109198.


Full diff: https://github.com/llvm/llvm-project/pull/108973.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+47-27)
  • (modified) llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll (+200)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index ba3ba7cc98136d..f19397d6810d03 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1451,34 +1451,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
         };
       }
 
-      // Check for operations that have the property that if
-      // both their operands have low zero bits, the result
-      // will have low zero bits.
-      if (Opcode == Instruction::Add ||
-          Opcode == Instruction::Sub ||
-          Opcode == Instruction::And ||
-          Opcode == Instruction::Or ||
-          Opcode == Instruction::Mul) {
-        // Change the context instruction to the "edge" that flows into the
-        // phi. This is important because that is where the value is actually
-        // "evaluated" even though it is used later somewhere else. (see also
-        // D69571).
-        SimplifyQuery RecQ = Q.getWithoutCondContext();
+      // Change the context instruction to the "edge" that flows into the
+      // phi. This is important because that is where the value is actually
+      // "evaluated" even though it is used later somewhere else. (see also
+      // D69571).
+      SimplifyQuery RecQ = Q.getWithoutCondContext();
 
-        unsigned OpNum = P->getOperand(0) == R ? 0 : 1;
-        Instruction *RInst = P->getIncomingBlock(OpNum)->getTerminator();
-        Instruction *LInst = P->getIncomingBlock(1 - OpNum)->getTerminator();
+      unsigned OpNum = P->getOperand(0) == R ? 0 : 1;
+      Instruction *RInst = P->getIncomingBlock(OpNum)->getTerminator();
+      Instruction *LInst = P->getIncomingBlock(1 - OpNum)->getTerminator();
 
-        // Ok, we have a PHI of the form L op= R. Check for low
-        // zero bits.
-        RecQ.CxtI = RInst;
-        computeKnownBits(R, DemandedElts, Known2, Depth + 1, RecQ);
+      // Ok, we have a PHI of the form L op= R. R and Known2 correspond to the
+      // start value.
+      RecQ.CxtI = RInst;
+      computeKnownBits(R, DemandedElts, Known2, Depth + 1, RecQ);
 
-        // We need to take the minimum number of known bits
-        KnownBits Known3(BitWidth);
-        RecQ.CxtI = LInst;
-        computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
+      KnownBits Known3(BitWidth);
+      RecQ.CxtI = LInst;
+      computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
 
+      switch (Opcode) {
+      // Check for operations that have the property that if both their operands
+      // have low zero bits, the result will have low zero bits.
+      case Instruction::Add:
+      case Instruction::Sub:
+      case Instruction::And:
+      case Instruction::Or:
+      case Instruction::Mul: {
+        // We need to take the minimum number of known bits
         Known.Zero.setLowBits(std::min(Known2.countMinTrailingZeros(),
                                        Known3.countMinTrailingZeros()));
 
@@ -1514,9 +1514,26 @@ static void computeKnownBitsFromOperator(const Operator *I,
                    Known3.isNonNegative())
             Known.makeNonNegative();
         }
-
         break;
       }
+      case Instruction::UDiv:
+      case Instruction::URem:
+         // Result cannot be larger than start value.
+         Known.Zero.setHighBits(Known2.countMinLeadingZeros());
+         break;
+      case Instruction::SDiv:
+      case Instruction::SRem: {
+        // Magnitude of result cannot be larger than that of start value.
+        if (Known2.isNonNegative())
+          Known.Zero.setHighBits(Known2.countMinLeadingZeros());
+        else if (Known2.isNegative())
+          Known.One.setHighBits(Known2.countMinLeadingOnes());
+        break;
+      }
+      default:
+        break;
+      }
+      break;
     }
 
     // Unreachable blocks may have zero-operand PHI nodes.
@@ -8968,7 +8985,6 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
     switch (Opcode) {
     default:
       continue;
-    // TODO: Expand list -- xor, div, gep, uaddo, etc..
     case Instruction::LShr:
     case Instruction::AShr:
     case Instruction::Shl:
@@ -8977,7 +8993,11 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
     case Instruction::And:
     case Instruction::Or:
     case Instruction::Mul:
-    case Instruction::FMul: {
+    case Instruction::FMul:
+    case Instruction::UDiv:
+    case Instruction::SDiv:
+    case Instruction::URem:
+    case Instruction::SRem: {
       Value *LL = LU->getOperand(0);
       Value *LR = LU->getOperand(1);
       // Find a recurrence.
diff --git a/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll b/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
index 3355328cad9ecf..cdf55ba6e92f0a 100644
--- a/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
+++ b/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
@@ -81,6 +81,206 @@ exit:
   ret i64 %res
 }
 
+define i64 @test_udiv(i1 %c) {
+; CHECK-LABEL: @test_udiv(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [9, %entry], [%iv.next, %loop]
+  %iv.next = udiv i64 %iv, 3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 16
+  ret i64 %res
+}
+
+define i64 @test_sdiv(i1 %c) {
+; CHECK-LABEL: @test_sdiv(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 16
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+  %iv.next = sdiv i64 %iv, -3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 16
+  ret i64 %res
+}
+
+define i64 @test_sdiv2(i1 %c) {
+; CHECK-LABEL: @test_sdiv2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 16
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+  %iv.next = sdiv i64 %iv, 3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 16
+  ret i64 %res
+}
+
+define i64 @test_sdiv3(i1 %c) {
+; CHECK-LABEL: @test_sdiv3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [9, %entry], [%iv.next, %loop]
+  %iv.next = sdiv i64 %iv, -3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, -16
+  ret i64 %res
+}
+
+define i64 @test_urem(i1 %c) {
+; CHECK-LABEL: @test_urem(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [3, %entry], [%iv.next, %loop]
+  %iv.next = urem i64 9, %iv
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 4
+  ret i64 %res
+}
+
+define i64 @test_srem(i1 %c) {
+; CHECK-LABEL: @test_srem(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 16
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+  %iv.next = srem i64 %iv, -3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 16
+  ret i64 %res
+}
+
+define i64 @test_srem2(i1 %c) {
+; CHECK-LABEL: @test_srem2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 16
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+  %iv.next = srem i64 %iv, 3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 16
+  ret i64 %res
+}
+
+define i64 @test_srem2_flipped(i1 %c) {
+; CHECK-LABEL: @test_srem2_flipped(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [3, %entry], [%iv.next, %loop]
+  %iv.next = srem i64 -9, %iv
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, 4
+  ret i64 %res
+}
+
+define i64 @test_srem3(i1 %c) {
+; CHECK-LABEL: @test_srem3(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 0
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [9, %entry], [%iv.next, %loop]
+  %iv.next = srem i64 %iv, -3
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, -16
+  ret i64 %res
+}
+
+define i64 @test_srem3_flipped(i1 %c) {
+; CHECK-LABEL: @test_srem3_flipped(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[LOOP:%.*]]
+; CHECK:       loop:
+; CHECK-NEXT:    br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK:       exit:
+; CHECK-NEXT:    ret i64 -4
+;
+entry:
+  br label %loop
+loop:
+  %iv = phi i64 [-3, %entry], [%iv.next, %loop]
+  %iv.next = srem i64 9, %iv
+  br i1 %c, label %exit, label %loop
+exit:
+  %res = and i64 %iv, -4
+  ret i64 %res
+}
+
 define i64 @test_and(i1 %c) {
 ; CHECK-LABEL: @test_and(
 ; CHECK-NEXT:  entry:

Copy link

github-actions bot commented Sep 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@artagnon artagnon marked this pull request as draft September 19, 2024 09:33
@artagnon
Copy link
Contributor Author

Sorry, this is incorrect, and there are subtle problems with it. Is it possible to experiment with an Alive2 proof for this example? (I believe Alive2 doesn't handle loops)

@dtcxzyw
Copy link
Member

dtcxzyw commented Sep 19, 2024

Sorry, this is incorrect, and there are subtle problems with it. Is it possible to experiment with an Alive2 proof for this example? (I believe Alive2 doesn't handle loops)

You can try with -src-unroll/-tgt-unroll.

@artagnon artagnon changed the title ValueTracking: complete matchSimpleRecurrence ValueTracking: simplify udiv/urem recurrences Oct 7, 2024
@artagnon artagnon marked this pull request as ready for review October 7, 2024 09:54
@artagnon
Copy link
Contributor Author

artagnon commented Oct 7, 2024

As far as I can tell, there is no possible sdiv/srem recurrence simplification: with negative values, the result alternates between positive and negative values. I've now implemented just udiv/urem simplification, and I think it is simple enough that the patch should be good.

@artagnon
Copy link
Contributor Author

Gentle ping.

1 similar comment
@artagnon
Copy link
Contributor Author

Gentle ping.

@artagnon
Copy link
Contributor Author

artagnon commented Nov 4, 2024

Gentle ping.

@artagnon
Copy link
Contributor Author

artagnon commented Nov 4, 2024

This patch is a very minor improvement, but an improvement nevertheless. I don't see it regressing compile-time, so it should be good, no?

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this code be integrated in the shift handling above? It seems pretty similar -- but not quite the same, and I'm suspicious of differences :)

@artagnon
Copy link
Contributor Author

artagnon commented Nov 5, 2024

Can this code be integrated in the shift handling above? It seems pretty similar -- but not quite the same, and I'm suspicious of differences :)

The only difference is a smarter context. Given that the context is a little difficult to test, I've integrated the code at the cost of a worse context. We can try to improve the context in a follow-up, but I'm not sure what tests we can write.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some nits.

udiv and urem recurrences have the property that the result can never
exceed the start value. Implement a simplification based on this
property.
@artagnon artagnon merged commit fef6613 into llvm:main Nov 7, 2024
4 of 6 checks passed
@artagnon artagnon deleted the vt-recur-divrem branch November 7, 2024 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants