[ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C) #87180

Closed
wants to merge 2 commits from goldsteinn/kb-nuw-add

Conversation

goldsteinn
Contributor

@goldsteinn goldsteinn commented Mar 31, 2024

  • [ValueTracking] Add tests for computing knownbits from (icmp upred (add/sub nuw X, Y), C); NFC

  • [ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C)

    (icmp ule/ult (add nuw X, Y), C) implies both (icmp ule/ult X, C) and
    (icmp ule/ult Y, C). We can use this to deduce leading zeros in X/Y.

    (icmp uge/ugt (sub nuw X, Y), C) implies (icmp uge/ugt X, C). We
    can use this to deduce leading ones in X.

    Proofs: https://alive2.llvm.org/ce/z/sc5k22
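    The implication can also be sanity-checked exhaustively at a narrow bit width. A minimal brute-force sketch in Python (standalone, not LLVM code; the helper names are ours):

    ```python
    BITS = 8
    UMAX = (1 << BITS) - 1

    def clz(v):
        # Leading zeros of an unsigned BITS-wide value (clz(0) == BITS).
        return BITS - v.bit_length()

    def implied_leading_zeros_hold(c):
        # For every non-overflowing 8-bit x + y with (x + y) u< c, check that
        # x u< c and y u< c, and that both share the leading zeros of c - 1.
        for x in range(UMAX + 1):
            for y in range(UMAX + 1):
                if x + y > UMAX:      # overflow: excluded by `add nuw`
                    continue
                if x + y < c:         # icmp ult (add nuw x, y), c
                    if not (x < c and y < c):
                        return False
                    if clz(x) < clz(c - 1) or clz(y) < clz(c - 1):
                        return False
        return True

    print(implied_leading_zeros_hold(32))  # → True
    ```

    The same loop with `<=` models the ule case, where the inclusive bound is c rather than c - 1.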

@goldsteinn goldsteinn requested a review from nikic as a code owner March 31, 2024 03:42
@goldsteinn goldsteinn changed the title goldsteinn/kb nuw add [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C) Mar 31, 2024
@llvmbot
Member

llvmbot commented Mar 31, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes
  • [ValueTracking] Add tests for computing knownbits from (icmp ult/ule (add nuw X, Y), C); NFC
  • [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C)

Based on: #87180


Full diff: https://github.com/llvm/llvm-project/pull/87180.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+28-12)
  • (modified) llvm/test/Transforms/InstCombine/known-bits.ll (+98-5)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index b5e8a1d22f264b..5bc2d605d3a1b1 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -698,13 +698,19 @@ static void computeKnownBitsFromCmp(const Value *V, CmpInst::Predicate Pred,
     break;
   }
   default:
-    const APInt *Offset = nullptr;
-    if (match(LHS, m_CombineOr(m_V, m_AddLike(m_V, m_APInt(Offset)))) &&
-        match(RHS, m_APInt(C))) {
-      ConstantRange LHSRange = ConstantRange::makeAllowedICmpRegion(Pred, *C);
-      if (Offset)
-        LHSRange = LHSRange.sub(*Offset);
-      Known = Known.unionWith(LHSRange.toKnownBits());
+    if (match(RHS, m_APInt(C))) {
+      const APInt *Offset = nullptr;
+      if (match(LHS, m_CombineOr(m_V, m_AddLike(m_V, m_APInt(Offset))))) {
+        ConstantRange LHSRange = ConstantRange::makeAllowedICmpRegion(Pred, *C);
+        if (Offset)
+          LHSRange = LHSRange.sub(*Offset);
+        Known = Known.unionWith(LHSRange.toKnownBits());
+      }
+      if ((Pred == ICmpInst::ICMP_ULE || Pred == ICmpInst::ICMP_ULT) &&
+          match(LHS, m_c_NUWAdd(m_V, m_Value()))) {
+        Known.Zero.setHighBits(
+            (*C - (Pred == ICmpInst::ICMP_ULT)).countLeadingZeros());
+      }
     }
     break;
   }
@@ -9283,11 +9289,21 @@ void llvm::findValuesAffectedByCondition(
             AddAffected(X);
         }
       } else {
-        // Handle (A + C1) u< C2, which is the canonical form of
-        // A > C3 && A < C4.
-        if (match(A, m_AddLike(m_Value(X), m_ConstantInt())) &&
-            match(B, m_ConstantInt()))
-          AddAffected(X);
+        if (match(B, m_ConstantInt())) {
+          // Handle (A + C1) u< C2, which is the canonical form of
+          // A > C3 && A < C4.
+          if (match(A, m_AddLike(m_Value(X), m_ConstantInt())))
+            AddAffected(X);
+
+          Value *Y;
+          // (X nuw+ Y) u< C -> X u< C && Y u< C
+          // (mirrors X | Y u< C -> X u< C && Y u< C)
+          if (ICmpInst::isUnsigned(Pred) &&
+              match(A, m_NUWAdd(m_Value(X), m_Value(Y)))) {
+            AddAffected(X);
+            AddAffected(Y);
+          }
+        }
 
         // Handle icmp slt/sgt (bitcast X to int), 0/-1, which is supported
         // by computeKnownFPClass().
diff --git a/llvm/test/Transforms/InstCombine/known-bits.ll b/llvm/test/Transforms/InstCombine/known-bits.ll
index 5305c78f691231..0ed63512f8cb49 100644
--- a/llvm/test/Transforms/InstCombine/known-bits.ll
+++ b/llvm/test/Transforms/InstCombine/known-bits.ll
@@ -124,7 +124,6 @@ exit:
   ret i8 %or2
 }
 
-
 define i8 @test_cond_and_bothways(i8 %x) {
 ; CHECK-LABEL: @test_cond_and_bothways(
 ; CHECK-NEXT:    [[AND:%.*]] = and i8 [[X:%.*]], 91
@@ -181,8 +180,6 @@ exit:
   ret i8 %or2
 }
 
-
-
 define i8 @test_cond_and_commuted(i8 %x, i1 %c1, i1 %c2) {
 ; CHECK-LABEL: @test_cond_and_commuted(
 ; CHECK-NEXT:    [[AND:%.*]] = and i8 [[X:%.*]], 3
@@ -343,7 +340,7 @@ exit:
   ret i8 %or2
 }
 
-define i32 @test_icmp_trunc1(i32 %x){
+define i32 @test_icmp_trunc1(i32 %x) {
 ; CHECK-LABEL: @test_icmp_trunc1(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[Y:%.*]] = trunc i32 [[X:%.*]] to i16
@@ -365,7 +362,7 @@ else:
   ret i32 0
 }
 
-define i32 @test_icmp_trunc_assume(i32 %x){
+define i32 @test_icmp_trunc_assume(i32 %x) {
 ; CHECK-LABEL: @test_icmp_trunc_assume(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[Y:%.*]] = trunc i32 [[X:%.*]] to i16
@@ -532,7 +529,103 @@ if.else:
   ret i1 %other
 }
 
+define i8 @test_icmp_add(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[N_ADD]], 32
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    ret i8 0
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ult i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
+
+define i8 @test_icmp_add2(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add2(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[N_ADD]], 14
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 0
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp uge i8 %n_add, 15
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  ret i8 %other
+if.else:
+  %r = and i8 %n, 32
+  ret i8 %r
 
+}
+
+define i8 @test_icmp_add_fail_bad_range(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add_fail_bad_range(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ult i8 [[N_ADD]], 33
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[R:%.*]] = and i8 [[N]], 32
+; CHECK-NEXT:    ret i8 [[R]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ule i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
+
+define i8 @test_icmp_add_fail_bad_pred(i8 %n, i8 %n2, i8 %other) {
+; CHECK-LABEL: @test_icmp_add_fail_bad_pred(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[N_ADD:%.*]] = add nuw i8 [[N:%.*]], [[N2:%.*]]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp ugt i8 [[N_ADD]], 32
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK:       if.then:
+; CHECK-NEXT:    [[R:%.*]] = and i8 [[N]], 32
+; CHECK-NEXT:    ret i8 [[R]]
+; CHECK:       if.else:
+; CHECK-NEXT:    ret i8 [[OTHER:%.*]]
+;
+entry:
+  %n_add = add nuw i8 %n, %n2
+  %cmp = icmp ugt i8 %n_add, 32
+  br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+  %r = and i8 %n, 32
+  ret i8 %r
+
+if.else:
+  ret i8 %other
+}
 
 declare void @use(i1)
 declare void @sink(i8)
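The `Known.Zero.setHighBits` line in the ValueTracking hunk derives its count from `*C - (Pred == ICMP_ULT)`, i.e. from the largest value the add (and hence each operand) can still take. A small Python sketch of that arithmetic (helper names are ours, not LLVM's):

```python
BITS = 8

def clz(v):
    # Leading zeros of an unsigned BITS-wide value.
    return BITS - v.bit_length()

def known_zero_high_bits(c, pred_is_ult):
    # Mirrors (*C - (Pred == ICMP_ULT)).countLeadingZeros(): the inclusive
    # upper bound on the operands is c - 1 for ult and c for ule.
    return clz(c - (1 if pred_is_ult else 0))

# ult 32 bounds the operands by 31, giving 3 known-zero high bits, so
# `and i8 %n, 32` folds to 0; ule 32 bounds them by 32, giving only 2,
# which is why test_icmp_add_fail_bad_range keeps its `and`.
print(known_zero_high_bits(32, True), known_zero_high_bits(32, False))  # → 3 2
```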

@goldsteinn goldsteinn requested a review from dtcxzyw March 31, 2024 03:43
@goldsteinn goldsteinn force-pushed the goldsteinn/kb-nuw-add branch from 0dd2548 to 0341e89 Compare March 31, 2024 16:01
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 31, 2024
dtcxzyw added a commit to dtcxzyw/llvm-opt-benchmark that referenced this pull request Mar 31, 2024
@dtcxzyw
Member

dtcxzyw commented Apr 1, 2024

Can you have a look at the regression dtcxzyw/llvm-opt-benchmark#465 (comment)?

@goldsteinn
Contributor Author

Can you have a look at the regression dtcxzyw/llvm-opt-benchmark#465 (comment)?

Okay, reduced test:

define void @foo(i32 %x, i32 %y) {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:
  %sub_x = sub i32 255, %x
  br label %L1

L1:
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:
  ret void
}

The issue occurs due to the InstCombine fold at:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp#L2280-L2288

We produce:

Bad:

define void @foo(i32 %x, i32 %y) local_unnamed_addr {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:                                               ; preds = %0
  %sub_x = xor i32 %x, 255
  br label %L1

L1:                                               ; preds = %L1, %L0
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:                                               ; preds = %L1, %0
  ret void
}

vs
Good:

define void @foo(i32 %x, i32 %y) local_unnamed_addr {
  %z = add nuw i32 %x, %y
  %cmp0 = icmp ult i32 %z, 255
  br i1 %cmp0, label %L0, label %L2

L0:                                               ; preds = %0
  %sub_x = sub i32 255, %x
  br label %L1

L1:                                               ; preds = %L1, %L0
  %phi = phi i32 [ 0, %L0 ], [ 1, %L1 ]
  %cmp1 = icmp ult i32 %phi, %sub_x
  br i1 %cmp1, label %L1, label %L2

L2:                                               ; preds = %L1, %0
  ret void
}

The issue then arises because we fold (icmp eq/ne (sub C0, x), C1) -> (icmp eq/ne x, C2) even when the sub has multiple uses, but we don't do the same for xor:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp#L3463-L3474

I'll add a patch for the xor case.
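For reference, the canonicalization that produced the regression rests on a borrow-free identity: when C is a mask of low ones, `sub C, x` equals `xor x, C`. A quick exhaustive check in Python (standalone sketch):

```python
def sub_becomes_xor(c):
    # For x u<= c, check c - x == c ^ x; this holds exactly when no bit of
    # x can borrow out of c, e.g. when c is 2**k - 1 (like 255 above).
    return all((c - x) == (c ^ x) for x in range(c + 1))

print(sub_becomes_xor(255), sub_becomes_xor(200))  # → True False
```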

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 1, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180
goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 4, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180
@goldsteinn
Contributor Author

My preference would be to not have this blocked on the missing multi-use xor fold. It seems fairly unrelated that we end up with that pattern with this patch but not without it (in that one case), and handling multi-use xor seems independently valuable.

goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Apr 4, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180
@goldsteinn goldsteinn changed the title [ValueTracking] Compute knownbits from (icmp ult/ule (add nuw X, Y), C) [ValueTracking] Compute knownbits from (icmp upred (add/sub nuw X, Y), C) Apr 5, 2024
@goldsteinn goldsteinn force-pushed the goldsteinn/kb-nuw-add branch from 0341e89 to 6d6bb29 Compare April 5, 2024 01:55
@goldsteinn
Contributor Author

@dtcxzyw I updated with support for sub nuw as well.

@goldsteinn goldsteinn force-pushed the goldsteinn/kb-nuw-add branch from 6d6bb29 to 7d54e14 Compare April 11, 2024 18:07
@goldsteinn
Contributor Author

rebased

@nikic
Contributor

nikic commented Apr 25, 2024

Needs another rebase.

@dtcxzyw I updated with support for sub nuw as well.

Why?

@goldsteinn
Contributor Author

Needs another rebase.

@dtcxzyw I updated with support for sub nuw as well.

Why?

I guess the add nuw mirrors the or case and the sub nuw mirrors the and case. I had meant to leave it as a follow-up todo, but it seemed to fit well enough.
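The analogy can be checked exhaustively at 8 bits: `or` dominates its operands the way a non-overflowing `add` does, and `and` is dominated by its first operand the way a non-underflowing `sub` is. A brute-force Python sketch:

```python
def analogy_holds(bits=8):
    umax = (1 << bits) - 1
    for x in range(umax + 1):
        for y in range(umax + 1):
            # or / add nuw: the result is u>= each operand, so an upper
            # bound on the result upper-bounds both operands.
            assert (x | y) >= x and (x | y) >= y
            if x + y <= umax:            # add nuw
                assert x + y >= x and x + y >= y
            # and / sub nuw: the result is u<= x, so a lower bound on the
            # result lower-bounds x.
            assert (x & y) <= x
            if x >= y:                   # sub nuw
                assert x - y <= x
    return True

print(analogy_holds())  # → True
```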

@goldsteinn goldsteinn force-pushed the goldsteinn/kb-nuw-add branch from 7d54e14 to 88fa1fa Compare April 25, 2024 19:00
@goldsteinn
Contributor Author

ping

@goldsteinn
Contributor Author

ping

Contributor

@nikic nikic left a comment


LGTM


if.else:
ret i8 %other
}
Contributor

Negative tests without nuw?

Contributor Author

Done + rebased.

…), C)`

`(icmp ule/ult (add nuw X, Y), C)` implies both `(icmp ule/ult X, C)` and
`(icmp ule/ult Y, C)`. We can use this to deduce leading zeros in `X`/`Y`.

`(icmp uge/ugt (sub nuw X, Y), C)` implies `(icmp uge/ugt X, C)`. We
can use this to deduce leading ones in `X`.

Proofs: https://alive2.llvm.org/ce/z/sc5k22
@goldsteinn goldsteinn force-pushed the goldsteinn/kb-nuw-add branch from 88fa1fa to 1adbd10 Compare May 16, 2024 17:25
goldsteinn added a commit to goldsteinn/llvm-project that referenced this pull request Jun 7, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180
goldsteinn added a commit that referenced this pull request Jun 7, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with #87180

Closes #87275
nekoshirro pushed a commit to nekoshirro/Alchemist-LLVM that referenced this pull request Jun 9, 2024
Two folds unlocked:
    `(icmp eq/ne (xor x, C0), C1)` -> `(icmp eq/ne x, C2)`
    `(icmp eq/ne (xor x, y), 0)` -> `(icmp eq/ne x, y)`

This fixes regressions associated with llvm#87180

Closes llvm#87275

Signed-off-by: Hafidz Muzakky <[email protected]>