Calculate KnownBits from Metadata correctly for vector loads #128908

LU-JOHN · 2025-02-26T16:50:18Z

Calculate KnownBits correctly from metadata for vector loads.

llvmbot · 2025-02-26T16:50:53Z

@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-llvm-analysis

Author: None (LU-JOHN)

Changes

Calculate KnownBits correctly from metadata for vector loads.
Utilize KnownBits to reduce 64-bit shl to 32-bit shl.

Full diff: https://github.com/llvm/llvm-project/pull/128908.diff

2 Files Affected:

(modified) llvm/lib/Analysis/ValueTracking.cpp (+6-1)
(modified) llvm/test/CodeGen/AMDGPU/shl64_reduce.ll (+2-1)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e3e026f7979da..1d6368b408811 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -429,11 +429,16 @@ void llvm::computeKnownBitsFromRangeMetadata(const MDNode &Ranges,
     ConstantInt *Upper =
         mdconst::extract<ConstantInt>(Ranges.getOperand(2 * i + 1));
     ConstantRange Range(Lower->getValue(), Upper->getValue());
+    unsigned RangeBitWidth = Lower->getBitWidth();
+    // BitWidth > RangeBitWidth can happen if Known is set to the width of a
+    // vector load but Ranges describes a vector element.
+    assert(BitWidth >= RangeBitWidth);
 
     // The first CommonPrefixBits of all values in Range are equal.
     unsigned CommonPrefixBits =
         (Range.getUnsignedMax() ^ Range.getUnsignedMin()).countl_zero();
-    APInt Mask = APInt::getHighBitsSet(BitWidth, CommonPrefixBits);
+    APInt Mask = APInt::getBitsSet(BitWidth, RangeBitWidth - CommonPrefixBits,
+                                   RangeBitWidth);
     APInt UnsignedMax = Range.getUnsignedMax().zextOrTrunc(BitWidth);
     Known.One &= UnsignedMax & Mask;
     Known.Zero &= ~UnsignedMax & Mask;
diff --git a/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll b/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
index 69242f4e44840..3f8553180df30 100644
--- a/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
+++ b/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
@@ -79,8 +79,9 @@ define <2 x i64> @shl_v2_metadata(<2 x i64> %arg0, ptr %arg1.ptr) {
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    flat_load_dwordx4 v[4:7], v[4:5]
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; CHECK-NEXT:    v_lshlrev_b64 v[0:1], v4, v[0:1]
 ; CHECK-NEXT:    v_lshlrev_b64 v[2:3], v6, v[2:3]
+; CHECK-NEXT:    v_lshlrev_b32_e32 v1, v4, v0
+; CHECK-NEXT:    v_mov_b32_e32 v0, 0
 ; CHECK-NEXT:    s_setpc_b64 s[30:31]
   %shift.amt = load <2 x i64>, ptr %arg1.ptr, !range !0, !noundef !{}
   %shl = shl <2 x i64> %arg0, %shift.amt

llvmbot · 2025-02-26T16:50:53Z

@llvm/pr-subscribers-backend-amdgpu

Author: None (LU-JOHN)

Changes

Calculate KnownBits correctly from metadata for vector loads.
Utilize KnownBits to reduce 64-bit shl to 32-bit shl.

Full diff: https://github.com/llvm/llvm-project/pull/128908.diff

2 Files Affected:

(modified) llvm/lib/Analysis/ValueTracking.cpp (+6-1)
(modified) llvm/test/CodeGen/AMDGPU/shl64_reduce.ll (+2-1)

diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index e3e026f7979da..1d6368b408811 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -429,11 +429,16 @@ void llvm::computeKnownBitsFromRangeMetadata(const MDNode &Ranges,
     ConstantInt *Upper =
         mdconst::extract<ConstantInt>(Ranges.getOperand(2 * i + 1));
     ConstantRange Range(Lower->getValue(), Upper->getValue());
+    unsigned RangeBitWidth = Lower->getBitWidth();
+    // BitWidth > RangeBitWidth can happen if Known is set to the width of a
+    // vector load but Ranges describes a vector element.
+    assert(BitWidth >= RangeBitWidth);
 
     // The first CommonPrefixBits of all values in Range are equal.
     unsigned CommonPrefixBits =
         (Range.getUnsignedMax() ^ Range.getUnsignedMin()).countl_zero();
-    APInt Mask = APInt::getHighBitsSet(BitWidth, CommonPrefixBits);
+    APInt Mask = APInt::getBitsSet(BitWidth, RangeBitWidth - CommonPrefixBits,
+                                   RangeBitWidth);
     APInt UnsignedMax = Range.getUnsignedMax().zextOrTrunc(BitWidth);
     Known.One &= UnsignedMax & Mask;
     Known.Zero &= ~UnsignedMax & Mask;
diff --git a/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll b/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
index 69242f4e44840..3f8553180df30 100644
--- a/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
+++ b/llvm/test/CodeGen/AMDGPU/shl64_reduce.ll
@@ -79,8 +79,9 @@ define <2 x i64> @shl_v2_metadata(<2 x i64> %arg0, ptr %arg1.ptr) {
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; CHECK-NEXT:    flat_load_dwordx4 v[4:7], v[4:5]
 ; CHECK-NEXT:    s_waitcnt vmcnt(0) lgkmcnt(0)
-; CHECK-NEXT:    v_lshlrev_b64 v[0:1], v4, v[0:1]
 ; CHECK-NEXT:    v_lshlrev_b64 v[2:3], v6, v[2:3]
+; CHECK-NEXT:    v_lshlrev_b32_e32 v1, v4, v0
+; CHECK-NEXT:    v_mov_b32_e32 v0, 0
 ; CHECK-NEXT:    s_setpc_b64 s[30:31]
   %shift.amt = load <2 x i64>, ptr %arg1.ptr, !range !0, !noundef !{}
   %shl = shl <2 x i64> %arg0, %shift.amt

nikic

For vectors, the width of KnownBits should be the same as the width in the range metadata: Both are element-wise.

llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp

arsenm · 2025-03-10T03:14:50Z

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

      // MemoryVT().getSizeInBits()), which is truncated to the correct elt size
      // if it is know. These are then extended to the original VT sizes below.
      if (const MDNode *MD = LD->getRanges()) {
+        ConstantInt *Lower = mdconst::extract<ConstantInt>(MD->getOperand(0));
+
+        // FIXME: If loads are modified (e.g. type legalization)


You're adding the fixme, but this is what this patch is fixing?

You're adding the fixme, but this is what this patch is fixing?

This is not the right location for the FIXME. Type legalization and perhaps other DAG processing does not maintain the consistency between the range metadata type and the load type/width. This inconsistency causes problems here, but the fix should not be here. I'll remove the FIXME.

Signed-off-by: John Lu <[email protected]>

…rong place. Signed-off-by: John Lu <[email protected]>

llvm/lib/Analysis/ValueTracking.cpp

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Signed-off-by: John Lu <[email protected]>

github-actions · 2025-03-12T15:26:12Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Signed-off-by: John Lu <[email protected]>

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Signed-off-by: John Lu <[email protected]>

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Signed-off-by: John Lu <[email protected]>

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

Signed-off-by: John Lu <[email protected]>

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

arsenm · 2025-03-25T02:19:16Z

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

+      if (const MDNode *MD = LD->getRanges()) {
+        ConstantInt *Lower = mdconst::extract<ConstantInt>(MD->getOperand(0));
+        if (Lower->getBitWidth() != NVT.getScalarSizeInBits())
+          LD->getMemOperand()->clearRanges();
+      }


In principle we should cast the range. But since this references IR metadata directly, this may imply mutating the IR which we should avoid doing

Is there a way to attach the range metadata to the subsequent bitcast to the original type?

Signed-off-by: John Lu <[email protected]>

durin42 · 2025-03-25T18:15:55Z

This is causing rustc to no longer be able to build its stage 2 compiler: instead it gets a SIGSEGV partway through. Search for SEGV in https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/35748#0195ce08-82ac-45e8-9161-ae441a1939bd - I'll try and do more debugging to figure out what's going on.

durin42 · 2025-03-25T20:22:23Z

I keep running out of disk space on my test machine (sigh) when I do debuginfo builds, but with assertions I got as far as src/llvm-project/llvm/include/llvm/ADT/APInt.h:1369: void llvm::APInt::setBits(unsigned int, unsigned int): Assertion `loBit <= BitWidth && "loBit out of range"' failed. Sadly no stack trace on that though.

LU-JOHN · 2025-03-25T20:44:41Z

I keep running out of disk space on my test machine (sigh) when I do debuginfo builds, but with assertions I got as far as src/llvm-project/llvm/include/llvm/ADT/APInt.h:1369: void llvm::APInt::setBits(unsigned int, unsigned int): Assertion `loBit <= BitWidth && "loBit out of range"' failed. Sadly no stack trace on that though.

I suspect that in the code:

  KnownBits KnownScalarMemory(ScalarMemorySize);
  if (const MDNode *MD = LD->getRanges())
    computeKnownBitsFromRangeMetadata(*MD, KnownScalarMemory);

the range metadata bitwidth does not == ScalarMemorySize when you see a crash. I thought that this should never happen, and I believe that the range metadata and load are incompatible if this does happen. Can you try changing the code to:

  if (const MDNode *MD = LD->getRanges()) {
    ConstantInt *Lower = mdconst::extract<ConstantInt>(MD->getOperand(0));
    if (Lower->getBitWidth() != ScalarMemorySize) {
      LD->dump();
      MD->dump();
      assert(0);
    }
    computeKnownBitsFromRangeMetadata(*MD, KnownScalarMemory);
  }

durin42 · 2025-03-25T21:19:28Z

That yielded

t24: v16i8,ch = load<(load (s128) from %ir.<badref>, !alias.scope <0x7fd6d812dc28>, !noalias <0x7fd6d812dc58>, !range <0x7fd6d80869d0>)> t0, t7, undef:i64
<0x7fd6d80869d0> = !{i128 0, i128 3}
rustc: /usr/local/google/home/augie/Programming/big/rust/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4016: llvm::KnownBits llvm::SelectionDAG::computeKnownBits(llvm::SDValue, const llvm::APInt&, unsigned int) const: Assertion `0' failed.

Hopefully that helps - I've also got things sorted so I can get the invocation open in lldb if that would help for me to inspect some stuff here.

LU-JOHN · 2025-03-25T21:39:09Z

That yielded

t24: v16i8,ch = load<(load (s128) from %ir.<badref>, !alias.scope <0x7fd6d812dc28>, !noalias <0x7fd6d812dc58>, !range <0x7fd6d80869d0>)> t0, t7, undef:i64
<0x7fd6d80869d0> = !{i128 0, i128 3}
rustc: /usr/local/google/home/augie/Programming/big/rust/src/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:4016: llvm::KnownBits llvm::SelectionDAG::computeKnownBits(llvm::SDValue, const llvm::APInt&, unsigned int) const: Assertion `0' failed.

Hopefully that helps - I've also got things sorted so I can get the invocation open in lldb if that would help for me to inspect some stuff here.

Thanks this helps. The range metadata has a 128-bit type, but the scalar type of the load is 8-bit. This is inconsistent. Can you send me a testcase so I can investigate how this inconsistent metadata was created?

durin42 · 2025-03-25T21:47:39Z

I'm honestly out of my depth here. My only path to reproducing this is building the Rust crate regex-automata and the resulting text IR is 42 megs, which I assume is way too much for you to look at usefully?

LU-JOHN · 2025-03-25T22:19:29Z

I'm honestly out of my depth here. My only path to reproducing this is building the Rust crate regex-automata and the resulting text IR is 42 megs, which I assume is way too much for you to look at usefully?

Can you send me instructions on how to build a rustc 2-stage compiler?

arsenm · 2025-03-25T22:54:47Z

I'm honestly out of my depth here. My only path to reproducing this is building the Rust crate regex-automata and the resulting text IR is 42 megs, which I assume is way too much for you to look at usefully?

It will easily shrink with llvm-reduce

TimNN · 2025-03-26T05:26:19Z

I was now also able to reproduce and got reduced IR, which crashes when run through llc -O0:

target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"

define void @"_ZN4core3ptr100drop_in_place$LT$core..option..Option$LT$regex_automata..meta..wrappers..ReverseHybridEngine$GT$$GT$17hb293fd3cea85facfE"() {
start:
  %0 = load i128, ptr null, align 16, !range !0, !noundef !1
  %1 = icmp eq i128 %0, 1
  br i1 %1, label %bb1, label %bb1

bb1:                                              ; preds = %start, %start
  ret void
}

!0 = !{i128 0, i128 3}
!1 = !{}

LU-JOHN · 2025-03-26T14:34:07Z

PR to fix this crash made in #133095

LU-JOHN requested a review from nikic as a code owner February 26, 2025 16:50

llvmbot added backend:AMDGPU llvm:analysis Includes value tracking, cost tables and constant folding labels Feb 26, 2025

nikic requested changes Feb 26, 2025

View reviewed changes

llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Mar 7, 2025

arsenm reviewed Mar 10, 2025

View reviewed changes

LU-JOHN added 3 commits March 11, 2025 10:31

Calculate KnownBits from Metadata correctly for vector loads

d8d5f02

Signed-off-by: John Lu <[email protected]>

Set KnownBits to correct width. Reduce 64-bit shl for all vector elts

8cc932e

Signed-off-by: John Lu <[email protected]>

Undo AMDGPU changes. Will be done in separate PR. Remove FIXME from w…

8b8949c

…rong place. Signed-off-by: John Lu <[email protected]>

LU-JOHN force-pushed the md_range branch from a71c8b2 to 8b8949c Compare March 11, 2025 18:14

LU-JOHN requested review from nikic and arsenm March 11, 2025 20:28

arsenm reviewed Mar 12, 2025

View reviewed changes

llvm/lib/Analysis/ValueTracking.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Outdated Show resolved Hide resolved

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Outdated Show resolved Hide resolved

LU-JOHN added 2 commits March 12, 2025 10:17

Use local KnownBits to get range metadata

a1a1a8b

Signed-off-by: John Lu <[email protected]>

Undo ValueTracking.cpp changes. Will be done in separate PR

f6ed89d

Signed-off-by: John Lu <[email protected]>

Fix formatting

2f36d8a

Signed-off-by: John Lu <[email protected]>

arsenm reviewed Mar 13, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Outdated Show resolved Hide resolved

Extend or truncate KnownMetadata to desired width

b7f4674

Signed-off-by: John Lu <[email protected]>

LU-JOHN force-pushed the md_range branch from 0428654 to b7f4674 Compare March 13, 2025 14:26

LU-JOHN requested a review from arsenm March 14, 2025 14:25

Test that range metadata is processed correctly for vector loads

883fe0c

Signed-off-by: John Lu <[email protected]>

arsenm reviewed Mar 17, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Outdated Show resolved Hide resolved

Compute KnownBits for scalar element. Remove unnecessary code.

73d6656

Signed-off-by: John Lu <[email protected]>

LU-JOHN requested a review from arsenm March 18, 2025 15:11

arsenm reviewed Mar 24, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Outdated Show resolved Hide resolved

Invalidate range metadata if legalization makes types incompatible

2c87382

Signed-off-by: John Lu <[email protected]>

LU-JOHN requested a review from arsenm March 25, 2025 01:54

arsenm reviewed Mar 25, 2025

View reviewed changes

arsenm approved these changes Mar 25, 2025

View reviewed changes

Invalidate range MD if legalized type is not integer

bbf8947

Signed-off-by: John Lu <[email protected]>

arsenm approved these changes Mar 25, 2025

View reviewed changes

arsenm merged commit 70aeb89 into llvm:main Mar 25, 2025
11 checks passed

This was referenced Mar 26, 2025

Invalidate range metadata when folding bitcast into load #133095

Merged

Ensure KnownBits passed when calculating from range md has right size #132985

Merged

Calculate KnownBits from Metadata correctly for vector loads #128908

Calculate KnownBits from Metadata correctly for vector loads #128908

Uh oh!

Conversation

LU-JOHN commented Feb 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 26, 2025

Uh oh!

nikic left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

arsenm Mar 10, 2025

Choose a reason for hiding this comment

Uh oh!

LU-JOHN Mar 10, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Mar 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

arsenm Mar 25, 2025

Choose a reason for hiding this comment

Uh oh!

LU-JOHN Mar 25, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

durin42 commented Mar 25, 2025

Uh oh!

durin42 commented Mar 25, 2025

Uh oh!

LU-JOHN commented Mar 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

durin42 commented Mar 25, 2025

Uh oh!

LU-JOHN commented Mar 25, 2025

Uh oh!

durin42 commented Mar 25, 2025

Uh oh!

LU-JOHN commented Mar 25, 2025

Uh oh!

arsenm commented Mar 25, 2025

Uh oh!

TimNN commented Mar 26, 2025

Uh oh!

LU-JOHN commented Mar 26, 2025

Uh oh!

Uh oh!

LU-JOHN commented Feb 26, 2025 •

edited

Loading

llvmbot commented Feb 26, 2025 •

edited

Loading

github-actions bot commented Mar 12, 2025 •

edited

Loading

LU-JOHN commented Mar 25, 2025 •

edited

Loading