[ValueTracking] Fix bit width handling in computeKnownBits() for GEPs #125532

Merged
nikic merged 2 commits into llvm:main from knownbits-gep-width on Feb 4, 2025

Conversation

@nikic nikic commented Feb 3, 2025

For GEPs, we have three bit widths involved: The pointer bit width, the index bit width, and the bit width of the GEP operands.

The correct behavior here is:

  • We need to sextOrTrunc the GEP operand to the index width before multiplying by the scale (see the sketch below).
  • If the index width and pointer width differ, GEP only ever modifies the low bits. Adds should not overflow into the high bits.

I'm testing this via unit tests because it's a bit tricky to test in IR with InstCombine canonicalization getting in the way.
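
To make the first point concrete, here is a small illustrative sketch using the KnownBits API. It is not part of the patch; the i8 index value 80, the element stride of 4 (an i32 element type), and the 16-bit index width are assumptions chosen to match the arithmetic behind the updated expectations in the ExtendBeforeMul unit test below.

#include "llvm/Support/KnownBits.h"
using namespace llvm;

// Why the index must be extended to the index width *before* it is
// multiplied by the element stride.
void extendBeforeMulExample() {
  unsigned IndexWidth = 16;                                 // index width from the datalayout
  KnownBits Index = KnownBits::makeConstant(APInt(8, 80));  // i8 GEP operand

  // Buggy order: multiply in the 8-bit operand width, then extend.
  // 80 * 4 wraps to 64 in 8 bits, so the offset comes out as 64.
  KnownBits Wrong =
      KnownBits::mul(Index, KnownBits::makeConstant(APInt(8, 4)))
          .sextOrTrunc(IndexWidth);

  // Correct order: extend to the index width first, then multiply.
  // sext(80) * 4 = 320, which fits in 16 bits.
  KnownBits Right =
      KnownBits::mul(Index.sextOrTrunc(IndexWidth),
                     KnownBits::makeConstant(APInt(IndexWidth, 4)));

  // Wrong.getConstant() == 64, Right.getConstant() == 320.
  (void)Wrong; (void)Right;
}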

@llvmbot added the llvm:analysis (Includes value tracking, cost tables and constant folding) label on Feb 3, 2025

llvmbot commented Feb 3, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/125532.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ValueTracking.cpp (+36-30)
  • (modified) llvm/unittests/Analysis/ValueTrackingTest.cpp (+6-6)
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6b61a3546e8b7c..61a14f471e88ff 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -1426,7 +1426,22 @@ static void computeKnownBitsFromOperator(const Operator *I,
     computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
     // Accumulate the constant indices in a separate variable
     // to minimize the number of calls to computeForAddSub.
-    APInt AccConstIndices(BitWidth, 0, /*IsSigned*/ true);
+    unsigned IndexWidth = Q.DL.getIndexTypeSizeInBits(I->getType());
+    APInt AccConstIndices(IndexWidth, 0);
+
+    auto AddIndex = [&](KnownBits IndexBits) {
+      if (IndexWidth == BitWidth) {
+        // Note that inbounds does *not* guarantee nsw for the addition, as only
+        // the offset is signed, while the base address is unsigned.
+        Known = KnownBits::add(Known, IndexBits);
+      } else {
+        // If the index width is smaller than the pointer width, only add the
+        // value to the low bits.
+        assert(IndexWidth < BitWidth &&
+               "Index width can't be larger than pointer width");
+        Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
+      }
+    };
 
     gep_type_iterator GTI = gep_type_begin(I);
     for (unsigned i = 1, e = I->getNumOperands(); i != e; ++i, ++GTI) {
@@ -1464,43 +1479,34 @@ static void computeKnownBitsFromOperator(const Operator *I,
         break;
       }
 
-      unsigned IndexBitWidth = Index->getType()->getScalarSizeInBits();
-      KnownBits IndexBits(IndexBitWidth);
-      computeKnownBits(Index, IndexBits, Depth + 1, Q);
-      TypeSize IndexTypeSize = GTI.getSequentialElementStride(Q.DL);
-      uint64_t TypeSizeInBytes = IndexTypeSize.getKnownMinValue();
-      KnownBits ScalingFactor(IndexBitWidth);
+      TypeSize Stride = GTI.getSequentialElementStride(Q.DL);
+      uint64_t StrideInBytes = Stride.getKnownMinValue();
+      if (!Stride.isScalable()) {
+        // Fast path for constant offset.
+        if (auto *CI = dyn_cast<ConstantInt>(Index)) {
+          AccConstIndices +=
+              CI->getValue().sextOrTrunc(IndexWidth) * StrideInBytes;
+          continue;
+        }
+      }
+
+      KnownBits IndexBits =
+          computeKnownBits(Index, Depth + 1, Q).sextOrTrunc(IndexWidth);
+      KnownBits ScalingFactor(IndexWidth);
       // Multiply by current sizeof type.
       // &A[i] == A + i * sizeof(*A[i]).
-      if (IndexTypeSize.isScalable()) {
+      if (Stride.isScalable()) {
         // For scalable types the only thing we know about sizeof is
         // that this is a multiple of the minimum size.
-        ScalingFactor.Zero.setLowBits(llvm::countr_zero(TypeSizeInBytes));
-      } else if (IndexBits.isConstant()) {
-        APInt IndexConst = IndexBits.getConstant();
-        APInt ScalingFactor(IndexBitWidth, TypeSizeInBytes);
-        IndexConst *= ScalingFactor;
-        AccConstIndices += IndexConst.sextOrTrunc(BitWidth);
-        continue;
+        ScalingFactor.Zero.setLowBits(llvm::countr_zero(StrideInBytes));
       } else {
         ScalingFactor =
-            KnownBits::makeConstant(APInt(IndexBitWidth, TypeSizeInBytes));
+            KnownBits::makeConstant(APInt(IndexWidth, StrideInBytes));
       }
-      IndexBits = KnownBits::mul(IndexBits, ScalingFactor);
-
-      // If the offsets have a different width from the pointer, according
-      // to the language reference we need to sign-extend or truncate them
-      // to the width of the pointer.
-      IndexBits = IndexBits.sextOrTrunc(BitWidth);
-
-      // Note that inbounds does *not* guarantee nsw for the addition, as only
-      // the offset is signed, while the base address is unsigned.
-      Known = KnownBits::add(Known, IndexBits);
-    }
-    if (!Known.isUnknown() && !AccConstIndices.isZero()) {
-      KnownBits Index = KnownBits::makeConstant(AccConstIndices);
-      Known = KnownBits::add(Known, Index);
+      AddIndex(KnownBits::mul(IndexBits, ScalingFactor));
     }
+    if (!Known.isUnknown() && !AccConstIndices.isZero())
+      AddIndex(KnownBits::makeConstant(AccConstIndices));
     break;
   }
   case Instruction::PHI: {
diff --git a/llvm/unittests/Analysis/ValueTrackingTest.cpp b/llvm/unittests/Analysis/ValueTrackingTest.cpp
index 39865fa195cf75..50e5e0e6b2ff5b 100644
--- a/llvm/unittests/Analysis/ValueTrackingTest.cpp
+++ b/llvm/unittests/Analysis/ValueTrackingTest.cpp
@@ -2680,7 +2680,7 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsAbsoluteSymbol) {
 }
 
 TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) {
-  // FIXME: The index should be extended before multiplying with the scale.
+  // The index should be extended before multiplying with the scale.
   parseAssembly(R"(
     target datalayout = "p:16:16:16"
 
@@ -2692,12 +2692,12 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPExtendBeforeMul) {
     }
     )");
   KnownBits Known = computeKnownBits(A, M->getDataLayout());
-  EXPECT_EQ(~64 & 0x7fff, Known.Zero);
-  EXPECT_EQ(64, Known.One);
+  EXPECT_EQ(~320 & 0x7fff, Known.Zero);
+  EXPECT_EQ(320, Known.One);
 }
 
 TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) {
-  // FIXME: GEP should only affect the index width.
+  // GEP should only affect the index width.
   parseAssembly(R"(
     target datalayout = "p:16:16:16:8"
 
@@ -2710,8 +2710,8 @@ TEST_F(ComputeKnownBitsTest, ComputeKnownBitsGEPOnlyIndexBits) {
     }
     )");
   KnownBits Known = computeKnownBits(A, M->getDataLayout());
-  EXPECT_EQ(0x7eff, Known.Zero);
-  EXPECT_EQ(0x100, Known.One);
+  EXPECT_EQ(0x7fff, Known.Zero);
+  EXPECT_EQ(0, Known.One);
 }
 
 TEST_F(ValueTrackingTest, HaveNoCommonBitsSet) {
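
A minimal sketch of the second point, the behavior the new AddIndex helper above implements with trunc/insertBits. It is illustrative only and assumes a 16-bit pointer whose value is fully known together with an 8-bit index width, as in the p:16:16:16:8 datalayout used by the OnlyIndexBits test.

#include "llvm/Support/KnownBits.h"
using namespace llvm;

// When the index width (8) is narrower than the pointer width (16), the
// scaled offset is added into the low 8 bits only; the high byte of the
// pointer is left untouched.
void lowBitsOnlyExample() {
  unsigned BitWidth = 16, IndexWidth = 8;
  KnownBits Known = KnownBits::makeConstant(APInt(16, 0x1200));  // base pointer
  KnownBits IndexBits = KnownBits::makeConstant(APInt(8, 0x34)); // scaled offset

  if (IndexWidth == BitWidth) {
    Known = KnownBits::add(Known, IndexBits);
  } else {
    // Add only into the low IndexWidth bits; a carry out of bit 7 wraps
    // within the low byte instead of propagating into the high byte.
    Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
  }
  // Known is now the constant 0x1234; the high byte 0x12 is unchanged.
}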

"Index width can't be larger than pointer width");
Known.insertBits(KnownBits::add(Known.trunc(IndexWidth), IndexBits), 0);
}
};

Contributor:

nit: Maybe rename to AddIndexToKnown, or return Known (and set Known = AddIndex(...)) where it's used below, for the sake of clarity.

Contributor Author (nikic):

Done.

if (IndexWidth == BitWidth) {
// Note that inbounds does *not* guarantee nsw for the addition, as only
// the offset is signed, while the base address is unsigned.
Known = KnownBits::add(Known, IndexBits);

Contributor:

Beyond the scope of this PR, but would it be slightly more optimal to start Known as zero so we could include nsw here (and nuw on the final addition) if we have inbounds?

Member:

We cannot preserve nsw here because we changed the order of pointer addition (line 1485). The constant indices are accumulated separately and only added at the end, so the additions are reassociated relative to the original GEP, and an intermediate sum in the new order can overflow even when none of the original additions would.

if (!Known.isUnknown() && !AccConstIndices.isZero()) {
KnownBits Index = KnownBits::makeConstant(AccConstIndices);
Known = KnownBits::add(Known, Index);
AddIndex(KnownBits::mul(IndexBits, ScalingFactor));

Contributor:

This can be nsw I think

Contributor Author (nikic):

Yes, we could preserve nuw/nsw here, but KnownBits::mul() doesn't currently support that. It probably wouldn't be useful in this context anyway.

@dtcxzyw dtcxzyw (Member) left a comment

LGTM

@nikic nikic merged commit 3bd11b5 into llvm:main Feb 4, 2025
8 checks passed
@nikic nikic deleted the knownbits-gep-width branch February 4, 2025 13:30
@nikic nikic added this to the LLVM 20.X Release milestone Feb 10, 2025

nikic commented Feb 10, 2025

/cherry-pick 3dc1ef1 3bd11b5

llvmbot commented Feb 10, 2025

/pull-request #126496

swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 11, 2025
…llvm#125532)

(cherry picked from commit 3bd11b5)
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…llvm#125532)

Labels
llvm:analysis (Includes value tracking, cost tables and constant folding)

4 participants