Skip to content

[SelectionDAG] Add support for extending masked loads in computeKnownBits #115450

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 11, 2024

Conversation

david-arm
Copy link
Contributor

We already support computing known bits for extending loads, but not for masked loads. For now I've only added support for zero-extends because that's the only thing currently tested. Even when the passthru value is poison we still know the top X bits are zero.

…Bits

We already support computing known bits for extending loads, but not
for masked loads. For now I've only added support for zero-extends
because that's the only thing currently tested. Even when the passthru
value is poison we still know the top X bits are zero.
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-llvm-selectiondag

Author: David Sherwood (david-arm)

Changes

We already support computing known bits for extending loads, but not for masked loads. For now I've only added support for zero-extends because that's the only thing currently tested. Even when the passthru value is poison we still know the top X bits are zero.


Full diff: https://github.com/llvm/llvm-project/pull/115450.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+13)
  • (modified) llvm/test/CodeGen/AArch64/sve-hadd.ll (+6-8)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 203e14f6cde3e3..901e63c47fac17 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -3920,6 +3920,19 @@ KnownBits SelectionDAG::computeKnownBits(SDValue Op, const APInt &DemandedElts,
     Known.Zero.setBitsFrom(1);
     break;
   }
+  case ISD::MGATHER:
+  case ISD::MLOAD: {
+    ISD::LoadExtType ETy =
+        (Opcode == ISD::MGATHER)
+            ? cast<MaskedGatherSDNode>(Op)->getExtensionType()
+            : cast<MaskedLoadSDNode>(Op)->getExtensionType();
+    if (ETy == ISD::ZEXTLOAD) {
+      EVT MemVT = cast<MemSDNode>(Op)->getMemoryVT();
+      KnownBits Known0(MemVT.getScalarSizeInBits());
+      return Known0.zext(BitWidth);
+    }
+    break;
+  }
   case ISD::LOAD: {
     LoadSDNode *LD = cast<LoadSDNode>(Op);
     const Constant *Cst = TLI->getTargetConstantFromLoad(LD);
diff --git a/llvm/test/CodeGen/AArch64/sve-hadd.ll b/llvm/test/CodeGen/AArch64/sve-hadd.ll
index 857a883d80ea3d..978ee4534e5e1a 100644
--- a/llvm/test/CodeGen/AArch64/sve-hadd.ll
+++ b/llvm/test/CodeGen/AArch64/sve-hadd.ll
@@ -1347,10 +1347,8 @@ define void @zext_mload_avgflooru(ptr %p1, ptr %p2, <vscale x 8 x i1> %mask) {
 ; SVE:       // %bb.0:
 ; SVE-NEXT:    ld1b { z0.h }, p0/z, [x0]
 ; SVE-NEXT:    ld1b { z1.h }, p0/z, [x1]
-; SVE-NEXT:    eor z2.d, z0.d, z1.d
-; SVE-NEXT:    and z0.d, z0.d, z1.d
-; SVE-NEXT:    lsr z1.h, z2.h, #1
 ; SVE-NEXT:    add z0.h, z0.h, z1.h
+; SVE-NEXT:    lsr z0.h, z0.h, #1
 ; SVE-NEXT:    st1h { z0.h }, p0, [x0]
 ; SVE-NEXT:    ret
 ;
@@ -1377,11 +1375,11 @@ define void @zext_mload_avgceilu(ptr %p1, ptr %p2, <vscale x 8 x i1> %mask) {
 ; SVE-LABEL: zext_mload_avgceilu:
 ; SVE:       // %bb.0:
 ; SVE-NEXT:    ld1b { z0.h }, p0/z, [x0]
-; SVE-NEXT:    ld1b { z1.h }, p0/z, [x1]
-; SVE-NEXT:    eor z2.d, z0.d, z1.d
-; SVE-NEXT:    orr z0.d, z0.d, z1.d
-; SVE-NEXT:    lsr z1.h, z2.h, #1
-; SVE-NEXT:    sub z0.h, z0.h, z1.h
+; SVE-NEXT:    mov z1.h, #-1 // =0xffffffffffffffff
+; SVE-NEXT:    ld1b { z2.h }, p0/z, [x1]
+; SVE-NEXT:    eor z0.d, z0.d, z1.d
+; SVE-NEXT:    sub z0.h, z2.h, z0.h
+; SVE-NEXT:    lsr z0.h, z0.h, #1
 ; SVE-NEXT:    st1b { z0.h }, p0, [x0]
 ; SVE-NEXT:    ret
 ;

@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Sherwood (david-arm)

Changes

We already support computing known bits for extending loads, but not for masked loads. For now I've only added support for zero-extends because that's the only thing currently tested. Even when the passthru value is poison we still know the top X bits are zero.


Full diff: https://github.com/llvm/llvm-project/pull/115450.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+13)
  • (modified) llvm/test/CodeGen/AArch64/sve-hadd.ll (+6-8)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 203e14f6cde3e3..901e63c47fac17 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -3920,6 +3920,19 @@ KnownBits SelectionDAG::computeKnownBits(SDValue Op, const APInt &DemandedElts,
     Known.Zero.setBitsFrom(1);
     break;
   }
+  case ISD::MGATHER:
+  case ISD::MLOAD: {
+    ISD::LoadExtType ETy =
+        (Opcode == ISD::MGATHER)
+            ? cast<MaskedGatherSDNode>(Op)->getExtensionType()
+            : cast<MaskedLoadSDNode>(Op)->getExtensionType();
+    if (ETy == ISD::ZEXTLOAD) {
+      EVT MemVT = cast<MemSDNode>(Op)->getMemoryVT();
+      KnownBits Known0(MemVT.getScalarSizeInBits());
+      return Known0.zext(BitWidth);
+    }
+    break;
+  }
   case ISD::LOAD: {
     LoadSDNode *LD = cast<LoadSDNode>(Op);
     const Constant *Cst = TLI->getTargetConstantFromLoad(LD);
diff --git a/llvm/test/CodeGen/AArch64/sve-hadd.ll b/llvm/test/CodeGen/AArch64/sve-hadd.ll
index 857a883d80ea3d..978ee4534e5e1a 100644
--- a/llvm/test/CodeGen/AArch64/sve-hadd.ll
+++ b/llvm/test/CodeGen/AArch64/sve-hadd.ll
@@ -1347,10 +1347,8 @@ define void @zext_mload_avgflooru(ptr %p1, ptr %p2, <vscale x 8 x i1> %mask) {
 ; SVE:       // %bb.0:
 ; SVE-NEXT:    ld1b { z0.h }, p0/z, [x0]
 ; SVE-NEXT:    ld1b { z1.h }, p0/z, [x1]
-; SVE-NEXT:    eor z2.d, z0.d, z1.d
-; SVE-NEXT:    and z0.d, z0.d, z1.d
-; SVE-NEXT:    lsr z1.h, z2.h, #1
 ; SVE-NEXT:    add z0.h, z0.h, z1.h
+; SVE-NEXT:    lsr z0.h, z0.h, #1
 ; SVE-NEXT:    st1h { z0.h }, p0, [x0]
 ; SVE-NEXT:    ret
 ;
@@ -1377,11 +1375,11 @@ define void @zext_mload_avgceilu(ptr %p1, ptr %p2, <vscale x 8 x i1> %mask) {
 ; SVE-LABEL: zext_mload_avgceilu:
 ; SVE:       // %bb.0:
 ; SVE-NEXT:    ld1b { z0.h }, p0/z, [x0]
-; SVE-NEXT:    ld1b { z1.h }, p0/z, [x1]
-; SVE-NEXT:    eor z2.d, z0.d, z1.d
-; SVE-NEXT:    orr z0.d, z0.d, z1.d
-; SVE-NEXT:    lsr z1.h, z2.h, #1
-; SVE-NEXT:    sub z0.h, z0.h, z1.h
+; SVE-NEXT:    mov z1.h, #-1 // =0xffffffffffffffff
+; SVE-NEXT:    ld1b { z2.h }, p0/z, [x1]
+; SVE-NEXT:    eor z0.d, z0.d, z1.d
+; SVE-NEXT:    sub z0.h, z2.h, z0.h
+; SVE-NEXT:    lsr z0.h, z0.h, #1
 ; SVE-NEXT:    st1b { z0.h }, p0, [x0]
 ; SVE-NEXT:    ret
 ;

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@paulwalker-arm paulwalker-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned on #115142, I don't think the existing handling (or documentation) of the passthru parameter is strict enough to reinforce the assumption in the commit message. I do agree with your interpretation though.

@david-arm david-arm merged commit 69b39e7 into llvm:main Nov 11, 2024
11 checks passed
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…Bits (llvm#115450)

We already support computing known bits for extending loads, but not for
masked loads. For now I've only added support for zero-extends because
that's the only thing currently tested. Even when the passthru value is
poison we still know the top X bits are zero.
@david-arm david-arm deleted the mload_known_bits branch January 28, 2025 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants