-
Notifications
You must be signed in to change notification settings - Fork 14.3k
ValueTracking: simplify udiv/urem recurrences #108973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
81c1be6
to
4a4d517
Compare
@llvm/pr-subscribers-llvm-analysis Author: Ramkumar Ramachandra (artagnon) ChangesmatchSimpleRecurrence only misses recurrence cases for UDiv, SDiv, URem, and SRem, as no meaningful simplification can be performed on an Xor recurrence using the KnownBits infrastructure; Xor with zero is already simplified. Hence, write simplifications for UDiv, SDiv, URem and SRem recurrences using KnownBits, completing the function. -- 8< -- Full diff: https://github.com/llvm/llvm-project/pull/108973.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index ba3ba7cc98136d..f19397d6810d03 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1451,34 +1451,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
};
}
- // Check for operations that have the property that if
- // both their operands have low zero bits, the result
- // will have low zero bits.
- if (Opcode == Instruction::Add ||
- Opcode == Instruction::Sub ||
- Opcode == Instruction::And ||
- Opcode == Instruction::Or ||
- Opcode == Instruction::Mul) {
- // Change the context instruction to the "edge" that flows into the
- // phi. This is important because that is where the value is actually
- // "evaluated" even though it is used later somewhere else. (see also
- // D69571).
- SimplifyQuery RecQ = Q.getWithoutCondContext();
+ // Change the context instruction to the "edge" that flows into the
+ // phi. This is important because that is where the value is actually
+ // "evaluated" even though it is used later somewhere else. (see also
+ // D69571).
+ SimplifyQuery RecQ = Q.getWithoutCondContext();
- unsigned OpNum = P->getOperand(0) == R ? 0 : 1;
- Instruction *RInst = P->getIncomingBlock(OpNum)->getTerminator();
- Instruction *LInst = P->getIncomingBlock(1 - OpNum)->getTerminator();
+ unsigned OpNum = P->getOperand(0) == R ? 0 : 1;
+ Instruction *RInst = P->getIncomingBlock(OpNum)->getTerminator();
+ Instruction *LInst = P->getIncomingBlock(1 - OpNum)->getTerminator();
- // Ok, we have a PHI of the form L op= R. Check for low
- // zero bits.
- RecQ.CxtI = RInst;
- computeKnownBits(R, DemandedElts, Known2, Depth + 1, RecQ);
+ // Ok, we have a PHI of the form L op= R. R and Known2 correspond to the
+ // start value.
+ RecQ.CxtI = RInst;
+ computeKnownBits(R, DemandedElts, Known2, Depth + 1, RecQ);
- // We need to take the minimum number of known bits
- KnownBits Known3(BitWidth);
- RecQ.CxtI = LInst;
- computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
+ KnownBits Known3(BitWidth);
+ RecQ.CxtI = LInst;
+ computeKnownBits(L, DemandedElts, Known3, Depth + 1, RecQ);
+ switch (Opcode) {
+ // Check for operations that have the property that if both their operands
+ // have low zero bits, the result will have low zero bits.
+ case Instruction::Add:
+ case Instruction::Sub:
+ case Instruction::And:
+ case Instruction::Or:
+ case Instruction::Mul: {
+ // We need to take the minimum number of known bits
Known.Zero.setLowBits(std::min(Known2.countMinTrailingZeros(),
Known3.countMinTrailingZeros()));
@@ -1514,9 +1514,26 @@ static void computeKnownBitsFromOperator(const Operator *I,
Known3.isNonNegative())
Known.makeNonNegative();
}
-
break;
}
+ case Instruction::UDiv:
+ case Instruction::URem:
+ // Result cannot be larger than start value.
+ Known.Zero.setHighBits(Known2.countMinLeadingZeros());
+ break;
+ case Instruction::SDiv:
+ case Instruction::SRem: {
+ // Magnitude of result cannot be larger than that of start value.
+ if (Known2.isNonNegative())
+ Known.Zero.setHighBits(Known2.countMinLeadingZeros());
+ else if (Known2.isNegative())
+ Known.One.setHighBits(Known2.countMinLeadingOnes());
+ break;
+ }
+ default:
+ break;
+ }
+ break;
}
// Unreachable blocks may have zero-operand PHI nodes.
@@ -8968,7 +8985,6 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
switch (Opcode) {
default:
continue;
- // TODO: Expand list -- xor, div, gep, uaddo, etc..
case Instruction::LShr:
case Instruction::AShr:
case Instruction::Shl:
@@ -8977,7 +8993,11 @@ bool llvm::matchSimpleRecurrence(const PHINode *P, BinaryOperator *&BO,
case Instruction::And:
case Instruction::Or:
case Instruction::Mul:
- case Instruction::FMul: {
+ case Instruction::FMul:
+ case Instruction::UDiv:
+ case Instruction::SDiv:
+ case Instruction::URem:
+ case Instruction::SRem: {
Value *LL = LU->getOperand(0);
Value *LR = LU->getOperand(1);
// Find a recurrence.
diff --git a/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll b/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
index 3355328cad9ecf..cdf55ba6e92f0a 100644
--- a/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
+++ b/llvm/test/Analysis/ValueTracking/recurrence-knownbits.ll
@@ -81,6 +81,206 @@ exit:
ret i64 %res
}
+define i64 @test_udiv(i1 %c) {
+; CHECK-LABEL: @test_udiv(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [9, %entry], [%iv.next, %loop]
+ %iv.next = udiv i64 %iv, 3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 16
+ ret i64 %res
+}
+
+define i64 @test_sdiv(i1 %c) {
+; CHECK-LABEL: @test_sdiv(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 16
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+ %iv.next = sdiv i64 %iv, -3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 16
+ ret i64 %res
+}
+
+define i64 @test_sdiv2(i1 %c) {
+; CHECK-LABEL: @test_sdiv2(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 16
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+ %iv.next = sdiv i64 %iv, 3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 16
+ ret i64 %res
+}
+
+define i64 @test_sdiv3(i1 %c) {
+; CHECK-LABEL: @test_sdiv3(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [9, %entry], [%iv.next, %loop]
+ %iv.next = sdiv i64 %iv, -3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, -16
+ ret i64 %res
+}
+
+define i64 @test_urem(i1 %c) {
+; CHECK-LABEL: @test_urem(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [3, %entry], [%iv.next, %loop]
+ %iv.next = urem i64 9, %iv
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 4
+ ret i64 %res
+}
+
+define i64 @test_srem(i1 %c) {
+; CHECK-LABEL: @test_srem(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 16
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+ %iv.next = srem i64 %iv, -3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 16
+ ret i64 %res
+}
+
+define i64 @test_srem2(i1 %c) {
+; CHECK-LABEL: @test_srem2(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 16
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [-9, %entry], [%iv.next, %loop]
+ %iv.next = srem i64 %iv, 3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 16
+ ret i64 %res
+}
+
+define i64 @test_srem2_flipped(i1 %c) {
+; CHECK-LABEL: @test_srem2_flipped(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [3, %entry], [%iv.next, %loop]
+ %iv.next = srem i64 -9, %iv
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, 4
+ ret i64 %res
+}
+
+define i64 @test_srem3(i1 %c) {
+; CHECK-LABEL: @test_srem3(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [9, %entry], [%iv.next, %loop]
+ %iv.next = srem i64 %iv, -3
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, -16
+ ret i64 %res
+}
+
+define i64 @test_srem3_flipped(i1 %c) {
+; CHECK-LABEL: @test_srem3_flipped(
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br label [[LOOP:%.*]]
+; CHECK: loop:
+; CHECK-NEXT: br i1 [[C:%.*]], label [[EXIT:%.*]], label [[LOOP]]
+; CHECK: exit:
+; CHECK-NEXT: ret i64 -4
+;
+entry:
+ br label %loop
+loop:
+ %iv = phi i64 [-3, %entry], [%iv.next, %loop]
+ %iv.next = srem i64 9, %iv
+ br i1 %c, label %exit, label %loop
+exit:
+ %res = and i64 %iv, -4
+ ret i64 %res
+}
+
define i64 @test_and(i1 %c) {
; CHECK-LABEL: @test_and(
; CHECK-NEXT: entry:
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
4a4d517
to
37f3777
Compare
Sorry, this is incorrect, and there are subtle problems with it. Is it possible to experiment with an Alive2 proof for this example? (I believe Alive2 doesn't handle loops) |
You can try with |
37f3777
to
756b8f9
Compare
As far as I can tell, there is no possible sdiv/srem recurrence simplification: with negative values, the result alternates between positive and negative values. I've now implemented just udiv/urem simplification, and I think it is simple enough that the patch should be good. |
756b8f9
to
cfeb596
Compare
cfeb596
to
03e2bf9
Compare
Gentle ping. |
1 similar comment
Gentle ping. |
03e2bf9
to
ce71274
Compare
Gentle ping. |
This patch is a very minor improvement, but an improvement nevertheless. I don't see it regressing compile-time, so it should be good, no? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this code be integrated in the shift handling above? It seems pretty similar -- but not quite the same, and I'm suspicious of differences :)
ce71274
to
c1df47e
Compare
The only difference is a smarter context. Given that the context is a little difficult to test, I've integrated the code at the cost of a worse context. We can try to improve the context in a follow-up, but I'm not sure what tests we can write. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some nits.
udiv and urem recurrences have the property that the result can never exceed the start value. Implement a simplification based on this property.
33b0f61
to
23acb5e
Compare
A urem recurrence has the property that the result can never exceed the start value. A udiv recurrence has the property that the result can never exceed either the start value or the numerator, whichever is greater. Implement a simplification based on these properties.