DAG: Fix legalization of vector addrspacecasts #113964


Merged: 1 commit, Oct 29, 2024

Conversation

arsenm
Contributor

@arsenm arsenm commented Oct 28, 2024

No description provided.

@arsenm arsenm added llvm:SelectionDAG SelectionDAGISel as well backend:AMDGPU labels Oct 28, 2024 — with Graphite App

@llvmbot
Member

llvmbot commented Oct 28, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Patch is 53.57 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113964.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+3)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+8)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+12-12)
  • (modified) llvm/test/CodeGen/AMDGPU/addrspacecast.ll (+1177)
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index e0a03383358b76..ab12a9222fa6d4 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -4364,6 +4364,9 @@ bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
     Results.push_back(DAG.getNode(ISD::FP_TO_SINT, dl, ResVT, RoundNode));
     break;
   }
+  case ISD::ADDRSPACECAST:
+    Results.push_back(DAG.UnrollVectorOp(Node));
+    break;
   case ISD::GLOBAL_OFFSET_TABLE:
   case ISD::GlobalAddress:
   case ISD::GlobalTLSAddress:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 1a86b3b51234d1..c1b55800a1c7a7 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -12583,6 +12583,14 @@ SDValue SelectionDAG::UnrollVectorOp(SDNode *N, unsigned ResNE) {
       Scalars.push_back(getNode(N->getOpcode(), dl, EltVT,
                                 Operands[0],
                                 getValueType(ExtVT)));
+      break;
+    }
+    case ISD::ADDRSPACECAST: {
+      const auto *ASC = cast<AddrSpaceCastSDNode>(N);
+      Scalars.push_back(getAddrSpaceCast(dl, EltVT, Operands[0],
+                                         ASC->getSrcAddressSpace(),
+                                         ASC->getDestAddressSpace()));
+      break;
     }
     }
   }
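The two hunks above implement the fix in two steps: `ExpandNode` now handles a vector `ISD::ADDRSPACECAST` by dispatching to `UnrollVectorOp`, and `UnrollVectorOp` learns to rebuild each extracted element as a scalar addrspacecast carrying the same source and destination address spaces. A rough Python model of that unrolling (the names and the placeholder scalar cast are illustrative, not LLVM API):

```python
def unroll_vector_addrspacecast(elements, src_as, dst_as, scalar_cast):
    # Mirror of SelectionDAG::UnrollVectorOp for ISD::ADDRSPACECAST:
    # extract each element, emit a scalar addrspacecast for it with the
    # original source/destination address spaces, rebuild the vector.
    return [scalar_cast(e, src_as, dst_as) for e in elements]

def scalar_cast(value, src_as, dst_as):
    # Placeholder standing in for the target-specific lowering of a
    # single scalar addrspacecast; here it just records its inputs.
    return (value, src_as, dst_as)

result = unroll_vector_addrspacecast([1, 2, 3], 0, 5, scalar_cast)
```

The key detail the patch adds is threading `getSrcAddressSpace()`/`getDestAddressSpace()` from the vector node into each scalar node; without it, `UnrollVectorOp` had no case for addrspacecast and legalization failed.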
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 0f65df0763cc83..e4b54c7d72b083 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -512,18 +512,18 @@ AMDGPUTargetLowering::AMDGPUTargetLowering(const TargetMachine &TM,
 
   for (MVT VT : VectorIntTypes) {
     // Expand the following operations for the current type by default.
-    setOperationAction({ISD::ADD,        ISD::AND,     ISD::FP_TO_SINT,
-                        ISD::FP_TO_UINT, ISD::MUL,     ISD::MULHU,
-                        ISD::MULHS,      ISD::OR,      ISD::SHL,
-                        ISD::SRA,        ISD::SRL,     ISD::ROTL,
-                        ISD::ROTR,       ISD::SUB,     ISD::SINT_TO_FP,
-                        ISD::UINT_TO_FP, ISD::SDIV,    ISD::UDIV,
-                        ISD::SREM,       ISD::UREM,    ISD::SMUL_LOHI,
-                        ISD::UMUL_LOHI,  ISD::SDIVREM, ISD::UDIVREM,
-                        ISD::SELECT,     ISD::VSELECT, ISD::SELECT_CC,
-                        ISD::XOR,        ISD::BSWAP,   ISD::CTPOP,
-                        ISD::CTTZ,       ISD::CTLZ,    ISD::VECTOR_SHUFFLE,
-                        ISD::SETCC},
+    setOperationAction({ISD::ADD,        ISD::AND,          ISD::FP_TO_SINT,
+                        ISD::FP_TO_UINT, ISD::MUL,          ISD::MULHU,
+                        ISD::MULHS,      ISD::OR,           ISD::SHL,
+                        ISD::SRA,        ISD::SRL,          ISD::ROTL,
+                        ISD::ROTR,       ISD::SUB,          ISD::SINT_TO_FP,
+                        ISD::UINT_TO_FP, ISD::SDIV,         ISD::UDIV,
+                        ISD::SREM,       ISD::UREM,         ISD::SMUL_LOHI,
+                        ISD::UMUL_LOHI,  ISD::SDIVREM,      ISD::UDIVREM,
+                        ISD::SELECT,     ISD::VSELECT,      ISD::SELECT_CC,
+                        ISD::XOR,        ISD::BSWAP,        ISD::CTPOP,
+                        ISD::CTTZ,       ISD::CTLZ,         ISD::VECTOR_SHUFFLE,
+                        ISD::SETCC,      ISD::ADDRSPACECAST},
                        VT, Expand);
   }
 
diff --git a/llvm/test/CodeGen/AMDGPU/addrspacecast.ll b/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
index 7336543b41cbc8..236956c1829e77 100644
--- a/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
+++ b/llvm/test/CodeGen/AMDGPU/addrspacecast.ll
@@ -409,6 +409,1183 @@ define amdgpu_kernel void @use_constant32bit_to_flat_addrspacecast_1(ptr addrspa
   ret void
 }
 
+define <2 x ptr addrspace(5)> @addrspacecast_v2p0_to_v2p5(<2 x ptr> %ptr) {
+; HSA-LABEL: addrspacecast_v2p0_to_v2p5:
+; HSA:       ; %bb.0:
+; HSA-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; HSA-NEXT:    v_cndmask_b32_e32 v0, -1, v0, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[2:3]
+; HSA-NEXT:    v_cndmask_b32_e32 v1, -1, v2, vcc
+; HSA-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <2 x ptr> %ptr to <2 x ptr addrspace(5)>
+  ret <2 x ptr addrspace(5)> %cast
+}
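Each unrolled element in the test above follows the flat-to-private cast rule visible in the generated code: `v_cmp_ne_u64` tests the 64-bit flat pointer against zero, and `v_cndmask_b32` selects either its low 32 bits or -1, the invalid private pointer. A small Python model of that per-element semantics (an interpretation of the assembly, not an LLVM API):

```python
def flat_to_private(flat_ptr):
    # v_cmp_ne_u64 + v_cndmask_b32: a null (zero) flat pointer maps to
    # -1, the invalid private pointer; any other flat pointer keeps
    # only its low 32 bits.
    if flat_ptr == 0:
        return 0xFFFFFFFF
    return flat_ptr & 0xFFFFFFFF

def cast_vector_flat_to_private(ptrs):
    # The vector cast is just the scalar rule applied element-wise,
    # which is exactly what the unrolling in this patch produces.
    return [flat_to_private(p) for p in ptrs]
```

This is why the wider vector tests (v3, v4, v8, v16) are the same compare/select pair repeated once per element.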
+
+define <3 x ptr addrspace(5)> @addrspacecast_v3p0_to_v3p5(<3 x ptr> %ptr) {
+; HSA-LABEL: addrspacecast_v3p0_to_v3p5:
+; HSA:       ; %bb.0:
+; HSA-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; HSA-NEXT:    v_cndmask_b32_e32 v0, -1, v0, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[2:3]
+; HSA-NEXT:    v_cndmask_b32_e32 v1, -1, v2, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[4:5]
+; HSA-NEXT:    v_cndmask_b32_e32 v2, -1, v4, vcc
+; HSA-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <3 x ptr> %ptr to <3 x ptr addrspace(5)>
+  ret <3 x ptr addrspace(5)> %cast
+}
+
+define <4 x ptr addrspace(5)> @addrspacecast_v4p0_to_v4p5(<4 x ptr> %ptr) {
+; HSA-LABEL: addrspacecast_v4p0_to_v4p5:
+; HSA:       ; %bb.0:
+; HSA-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; HSA-NEXT:    v_cndmask_b32_e32 v0, -1, v0, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[2:3]
+; HSA-NEXT:    v_cndmask_b32_e32 v1, -1, v2, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[4:5]
+; HSA-NEXT:    v_cndmask_b32_e32 v2, -1, v4, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[6:7]
+; HSA-NEXT:    v_cndmask_b32_e32 v3, -1, v6, vcc
+; HSA-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <4 x ptr> %ptr to <4 x ptr addrspace(5)>
+  ret <4 x ptr addrspace(5)> %cast
+}
+
+define <8 x ptr addrspace(5)> @addrspacecast_v8p0_to_v8p5(<8 x ptr> %ptr) {
+; HSA-LABEL: addrspacecast_v8p0_to_v8p5:
+; HSA:       ; %bb.0:
+; HSA-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; HSA-NEXT:    v_cndmask_b32_e32 v0, -1, v0, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[2:3]
+; HSA-NEXT:    v_cndmask_b32_e32 v1, -1, v2, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[4:5]
+; HSA-NEXT:    v_cndmask_b32_e32 v2, -1, v4, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[6:7]
+; HSA-NEXT:    v_cndmask_b32_e32 v3, -1, v6, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[8:9]
+; HSA-NEXT:    v_cndmask_b32_e32 v4, -1, v8, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[10:11]
+; HSA-NEXT:    v_cndmask_b32_e32 v5, -1, v10, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[12:13]
+; HSA-NEXT:    v_cndmask_b32_e32 v6, -1, v12, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[14:15]
+; HSA-NEXT:    v_cndmask_b32_e32 v7, -1, v14, vcc
+; HSA-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <8 x ptr> %ptr to <8 x ptr addrspace(5)>
+  ret <8 x ptr addrspace(5)> %cast
+}
+
+define <16 x ptr addrspace(5)> @addrspacecast_v16p0_to_v16p5(<16 x ptr> %ptr) {
+; HSA-LABEL: addrspacecast_v16p0_to_v16p5:
+; HSA:       ; %bb.0:
+; HSA-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; HSA-NEXT:    buffer_load_dword v31, off, s[0:3], s32
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[0:1]
+; HSA-NEXT:    v_cmp_ne_u64_e64 s[4:5], 0, v[24:25]
+; HSA-NEXT:    v_cndmask_b32_e32 v0, -1, v0, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[2:3]
+; HSA-NEXT:    v_cmp_ne_u64_e64 s[6:7], 0, v[26:27]
+; HSA-NEXT:    v_cndmask_b32_e32 v1, -1, v2, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[4:5]
+; HSA-NEXT:    v_cmp_ne_u64_e64 s[8:9], 0, v[28:29]
+; HSA-NEXT:    v_cndmask_b32_e32 v2, -1, v4, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[6:7]
+; HSA-NEXT:    v_cndmask_b32_e32 v3, -1, v6, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[8:9]
+; HSA-NEXT:    v_cndmask_b32_e32 v4, -1, v8, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[10:11]
+; HSA-NEXT:    v_cndmask_b32_e32 v5, -1, v10, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[12:13]
+; HSA-NEXT:    v_cndmask_b32_e64 v13, -1, v26, s[6:7]
+; HSA-NEXT:    v_cndmask_b32_e32 v6, -1, v12, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[14:15]
+; HSA-NEXT:    v_cndmask_b32_e64 v12, -1, v24, s[4:5]
+; HSA-NEXT:    v_cndmask_b32_e32 v7, -1, v14, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[16:17]
+; HSA-NEXT:    v_cndmask_b32_e64 v14, -1, v28, s[8:9]
+; HSA-NEXT:    v_cndmask_b32_e32 v8, -1, v16, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[18:19]
+; HSA-NEXT:    v_cndmask_b32_e32 v9, -1, v18, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[20:21]
+; HSA-NEXT:    v_cndmask_b32_e32 v10, -1, v20, vcc
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[22:23]
+; HSA-NEXT:    v_cndmask_b32_e32 v11, -1, v22, vcc
+; HSA-NEXT:    s_waitcnt vmcnt(0)
+; HSA-NEXT:    v_cmp_ne_u64_e32 vcc, 0, v[30:31]
+; HSA-NEXT:    v_cndmask_b32_e32 v15, -1, v30, vcc
+; HSA-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <16 x ptr> %ptr to <16 x ptr addrspace(5)>
+  ret <16 x ptr addrspace(5)> %cast
+}
+
+define <2 x ptr> @addrspacecast_v2p5_to_v2p0(<2 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v2p5_to_v2p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v3, s4
+; CI-NEXT:    v_cndmask_b32_e32 v4, 0, v3, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v2, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v3, 0, v3, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v4
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v2p5_to_v2p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v3, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v4, 0, v3, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v2, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v3, 0, v3, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v4
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <2 x ptr addrspace(5)> %ptr to <2 x ptr>
+  ret <2 x ptr> %cast
+}
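The reverse direction needs the private aperture: the code compares each 32-bit private pointer against -1 and, when it is valid, pairs it with the high half of the aperture base (`src_private_base` on GFX9, a load from the queue pointer at offset 0x11 dwords on CI) to form the 64-bit flat pointer; -1 becomes null. A Python model of that per-element rule (again an interpretation of the assembly, not an LLVM API):

```python
def private_to_flat(priv_ptr, aperture_hi):
    # v_cmp_ne_u32 vcc, -1, v0 then two v_cndmask_b32: -1, the invalid
    # private pointer, maps to the null flat pointer; otherwise the
    # flat pointer is the aperture's high 32 bits over the 32-bit
    # private offset.
    if priv_ptr == 0xFFFFFFFF:
        return 0
    return (aperture_hi << 32) | priv_ptr
```

The two `v_cndmask_b32` instructions per element in the CI/GFX9 output correspond to selecting the low and high halves of this 64-bit result under the same condition.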
+
+define <3 x ptr> @addrspacecast_v3p5_to_v3p0(<3 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v3p5_to_v3p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v5, s4
+; CI-NEXT:    v_cndmask_b32_e32 v7, 0, v5, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v6, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v3, 0, v5, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v5, 0, v5, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v7
+; CI-NEXT:    v_mov_b32_e32 v2, v6
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v3p5_to_v3p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v5, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v7, 0, v5, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v6, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v3, 0, v5, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v5, 0, v5, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v7
+; GFX9-NEXT:    v_mov_b32_e32 v2, v6
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <3 x ptr addrspace(5)> %ptr to <3 x ptr>
+  ret <3 x ptr> %cast
+}
+
+define <4 x ptr> @addrspacecast_v4p5_to_v4p0(<4 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v4p5_to_v4p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v7, s4
+; CI-NEXT:    v_cndmask_b32_e32 v10, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v8, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v5, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v6, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v7, 0, v7, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v10
+; CI-NEXT:    v_mov_b32_e32 v2, v8
+; CI-NEXT:    v_mov_b32_e32 v3, v9
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v4p5_to_v4p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v7, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v10, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v8, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v9, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v5, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; GFX9-NEXT:    v_cndmask_b32_e32 v6, 0, v3, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v7, 0, v7, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v10
+; GFX9-NEXT:    v_mov_b32_e32 v2, v8
+; GFX9-NEXT:    v_mov_b32_e32 v3, v9
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <4 x ptr addrspace(5)> %ptr to <4 x ptr>
+  ret <4 x ptr> %cast
+}
+
+define <8 x ptr> @addrspacecast_v8p5_to_v8p0(<8 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v8p5_to_v8p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v15, s4
+; CI-NEXT:    v_cndmask_b32_e32 v22, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v16, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v17, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v18, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v19, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v20, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v21, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; CI-NEXT:    v_cndmask_b32_e32 v8, 0, v4, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v5
+; CI-NEXT:    v_cndmask_b32_e32 v10, 0, v5, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v11, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v6
+; CI-NEXT:    v_cndmask_b32_e32 v12, 0, v6, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v13, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v7
+; CI-NEXT:    v_cndmask_b32_e32 v14, 0, v7, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v15, 0, v15, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v22
+; CI-NEXT:    v_mov_b32_e32 v2, v16
+; CI-NEXT:    v_mov_b32_e32 v3, v17
+; CI-NEXT:    v_mov_b32_e32 v4, v18
+; CI-NEXT:    v_mov_b32_e32 v5, v19
+; CI-NEXT:    v_mov_b32_e32 v6, v20
+; CI-NEXT:    v_mov_b32_e32 v7, v21
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v8p5_to_v8p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v15, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v22, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v16, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v17, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v18, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v19, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; GFX9-NEXT:    v_cndmask_b32_e32 v20, 0, v3, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v21, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; GFX9-NEXT:    v_cndmask_b32_e32 v8, 0, v4, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v9, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v5
+; GFX9-NEXT:    v_cndmask_b32_e32 v10, 0, v5, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v11, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v6
+; GFX9-NEXT:    v_cndmask_b32_e32 v12, 0, v6, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v13, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v7
+; GFX9-NEXT:    v_cndmask_b32_e32 v14, 0, v7, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v15, 0, v15, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v22
+; GFX9-NEXT:    v_mov_b32_e32 v2, v16
+; GFX9-NEXT:    v_mov_b32_e32 v3, v17
+; GFX9-NEXT:    v_mov_b32_e32 v4, v18
+; GFX9-NEXT:    v_mov_b32_e32 v5, v19
+; GFX9-NEXT:    v_mov_b32_e32 v6, v20
+; GFX9-NEXT:    v_mov_b32_e32 v7, v21
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <8 x ptr addrspace(5)> %ptr to <8 x ptr>
+  ret <8 x ptr> %cast
+}
+
+define <16 x ptr> @addrspacecast_v16p5_to_v16p0(<16 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v16p5_to_v16p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    v_cmp_ne_u32_e64 s[6:7], -1, v6
+; CI-NEXT:    v_cmp_ne_u32_e64 s[8:9], -1, v7
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v31, s4
+; CI-NEXT:    v_cndmask_b32_e32 v48, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v35, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v33, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v36, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v49, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v37, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v34, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; CI-NEXT:    v_cmp_ne_u32_e64 s[4:5], -1, v5
+; CI-NEXT:    v_cndmask_b32_e32 v38, 0, v4, vcc
+; CI-NEXT:    v_cndmask_b32_e64 v50, 0, v5, s[4:5]
+; CI-NEXT:    v_cndmask_b32_e64 v39, 0, v6, s[6:7]
+; CI-NEXT:    v_cndmask_b32_e64 v32, 0, v7, s[8:9]
+; CI-NEXT:    v_cmp_ne_u32_e64 s[10:11], -1, v8
+; CI-NEXT:    v_cmp_ne_u32_e64 s[12:13], -1, v9
+; CI-NEXT:    v_cmp_ne_u32_e64 s[14:15], -1, v10
+; CI-NEXT:    v_cmp_ne_u32_e64 s[16:17], -1, v11
+; CI-NEXT:    v_cmp_ne_u32_e64 s[18:19], -1, v12
+; CI-NEXT:    v_cmp_ne_u32_e64 s[20:21], -1, v13
+; CI-NEXT:    v_cmp_ne_u32_e64 s[22:23], -1, v14
+; CI-NEXT:    v_cmp_ne_u32_e64 s[24:25], -1, v15
+; CI-NEXT:    v_cndmask_b32_e64 v16, 0, v8, s[10:11]
+; CI-NEXT:    v_cndmask_b32_e64 v18, 0, v9, s[12:13]
+; CI-NEXT:    v_cndmask_b32_e64 v20, 0, v10, s[14:15]
+; CI-NEXT:    v_cndmask_b32_e64 v22, 0, v11, s[16:17]
+; CI-NEXT:    v_cndmask_b32_e64 v24, 0, v12, s[18:19]
+; CI-NEXT:    v_cndmask_b32_e64 v26, 0, v13, s[20:21]
+; CI-NEXT:    v_cndmask_b32_e64 v28, 0, v14, s[22:23]
+; CI-NEXT:    v_cndmask_b32_e64 v30, 0, v15, s[24:25]
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v31, vcc
+; CI-NEXT:    v_cndmask_b32_e64 v11, 0, v31, s[4:5]
+; CI-NEXT:    v_cndmask_b32_e64 v13, 0, v31, s[6:7]
+; CI-NEXT:    v_cndmask_b32_e64 v15, 0, v31, s[8:9]
+; CI-NEXT:    v_cndmask_b32_e64 v17, 0, v31, s[10:11]
+; CI-NEXT:    v_cndmask_b32_e64 v19, 0, v31, s[12:13]
+; CI-NEXT:    v_cndmask_b32_e64 v21, 0, v31, s[14:15]
+; CI-NEXT:    v_cndmask_b32_e64 v23, 0, v31, s[16:17]
+; CI-NEXT:    v_cndmask_b32_e64 v25, 0, v31, s[18:19]
+; CI-NEXT:    v_cndmask_b32_e64 v27, 0, v31, s[20:21]
+; CI-NEXT:   ...
[truncated]

@llvmbot
Member

llvmbot commented Oct 28, 2024

@llvm/pr-subscribers-llvm-selectiondag

(The body of this comment repeats the same patch summary and diff shown in the comment above.)
+; GFX9-NEXT:    v_mov_b32_e32 v5, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v7, 0, v5, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v6, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v3, 0, v5, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v5, 0, v5, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v7
+; GFX9-NEXT:    v_mov_b32_e32 v2, v6
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <3 x ptr addrspace(5)> %ptr to <3 x ptr>
+  ret <3 x ptr> %cast
+}
+
+define <4 x ptr> @addrspacecast_v4p5_to_v4p0(<4 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v4p5_to_v4p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v7, s4
+; CI-NEXT:    v_cndmask_b32_e32 v10, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v8, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v5, 0, v7, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v6, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v7, 0, v7, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v10
+; CI-NEXT:    v_mov_b32_e32 v2, v8
+; CI-NEXT:    v_mov_b32_e32 v3, v9
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v4p5_to_v4p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v7, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v10, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v8, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v9, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v4, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v5, 0, v7, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; GFX9-NEXT:    v_cndmask_b32_e32 v6, 0, v3, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v7, 0, v7, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v10
+; GFX9-NEXT:    v_mov_b32_e32 v2, v8
+; GFX9-NEXT:    v_mov_b32_e32 v3, v9
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <4 x ptr addrspace(5)> %ptr to <4 x ptr>
+  ret <4 x ptr> %cast
+}
+
+define <8 x ptr> @addrspacecast_v8p5_to_v8p0(<8 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v8p5_to_v8p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v15, s4
+; CI-NEXT:    v_cndmask_b32_e32 v22, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v16, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v17, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v18, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v19, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v20, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v21, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; CI-NEXT:    v_cndmask_b32_e32 v8, 0, v4, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v5
+; CI-NEXT:    v_cndmask_b32_e32 v10, 0, v5, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v11, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v6
+; CI-NEXT:    v_cndmask_b32_e32 v12, 0, v6, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v13, 0, v15, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v7
+; CI-NEXT:    v_cndmask_b32_e32 v14, 0, v7, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v15, 0, v15, vcc
+; CI-NEXT:    v_mov_b32_e32 v1, v22
+; CI-NEXT:    v_mov_b32_e32 v2, v16
+; CI-NEXT:    v_mov_b32_e32 v3, v17
+; CI-NEXT:    v_mov_b32_e32 v4, v18
+; CI-NEXT:    v_mov_b32_e32 v5, v19
+; CI-NEXT:    v_mov_b32_e32 v6, v20
+; CI-NEXT:    v_mov_b32_e32 v7, v21
+; CI-NEXT:    s_setpc_b64 s[30:31]
+;
+; GFX9-LABEL: addrspacecast_v8p5_to_v8p0:
+; GFX9:       ; %bb.0:
+; GFX9-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX9-NEXT:    s_mov_b64 s[4:5], src_private_base
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; GFX9-NEXT:    v_mov_b32_e32 v15, s5
+; GFX9-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v22, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; GFX9-NEXT:    v_cndmask_b32_e32 v16, 0, v1, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v17, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; GFX9-NEXT:    v_cndmask_b32_e32 v18, 0, v2, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v19, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; GFX9-NEXT:    v_cndmask_b32_e32 v20, 0, v3, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v21, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; GFX9-NEXT:    v_cndmask_b32_e32 v8, 0, v4, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v9, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v5
+; GFX9-NEXT:    v_cndmask_b32_e32 v10, 0, v5, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v11, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v6
+; GFX9-NEXT:    v_cndmask_b32_e32 v12, 0, v6, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v13, 0, v15, vcc
+; GFX9-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v7
+; GFX9-NEXT:    v_cndmask_b32_e32 v14, 0, v7, vcc
+; GFX9-NEXT:    v_cndmask_b32_e32 v15, 0, v15, vcc
+; GFX9-NEXT:    v_mov_b32_e32 v1, v22
+; GFX9-NEXT:    v_mov_b32_e32 v2, v16
+; GFX9-NEXT:    v_mov_b32_e32 v3, v17
+; GFX9-NEXT:    v_mov_b32_e32 v4, v18
+; GFX9-NEXT:    v_mov_b32_e32 v5, v19
+; GFX9-NEXT:    v_mov_b32_e32 v6, v20
+; GFX9-NEXT:    v_mov_b32_e32 v7, v21
+; GFX9-NEXT:    s_setpc_b64 s[30:31]
+  %cast = addrspacecast <8 x ptr addrspace(5)> %ptr to <8 x ptr>
+  ret <8 x ptr> %cast
+}
+
+define <16 x ptr> @addrspacecast_v16p5_to_v16p0(<16 x ptr addrspace(5)> %ptr) {
+; CI-LABEL: addrspacecast_v16p5_to_v16p0:
+; CI:       ; %bb.0:
+; CI-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; CI-NEXT:    s_load_dword s4, s[6:7], 0x11
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v0
+; CI-NEXT:    v_cndmask_b32_e32 v0, 0, v0, vcc
+; CI-NEXT:    v_cmp_ne_u32_e64 s[6:7], -1, v6
+; CI-NEXT:    v_cmp_ne_u32_e64 s[8:9], -1, v7
+; CI-NEXT:    s_waitcnt lgkmcnt(0)
+; CI-NEXT:    v_mov_b32_e32 v31, s4
+; CI-NEXT:    v_cndmask_b32_e32 v48, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v1
+; CI-NEXT:    v_cndmask_b32_e32 v35, 0, v1, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v33, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v2
+; CI-NEXT:    v_cndmask_b32_e32 v36, 0, v2, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v49, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v3
+; CI-NEXT:    v_cndmask_b32_e32 v37, 0, v3, vcc
+; CI-NEXT:    v_cndmask_b32_e32 v34, 0, v31, vcc
+; CI-NEXT:    v_cmp_ne_u32_e32 vcc, -1, v4
+; CI-NEXT:    v_cmp_ne_u32_e64 s[4:5], -1, v5
+; CI-NEXT:    v_cndmask_b32_e32 v38, 0, v4, vcc
+; CI-NEXT:    v_cndmask_b32_e64 v50, 0, v5, s[4:5]
+; CI-NEXT:    v_cndmask_b32_e64 v39, 0, v6, s[6:7]
+; CI-NEXT:    v_cndmask_b32_e64 v32, 0, v7, s[8:9]
+; CI-NEXT:    v_cmp_ne_u32_e64 s[10:11], -1, v8
+; CI-NEXT:    v_cmp_ne_u32_e64 s[12:13], -1, v9
+; CI-NEXT:    v_cmp_ne_u32_e64 s[14:15], -1, v10
+; CI-NEXT:    v_cmp_ne_u32_e64 s[16:17], -1, v11
+; CI-NEXT:    v_cmp_ne_u32_e64 s[18:19], -1, v12
+; CI-NEXT:    v_cmp_ne_u32_e64 s[20:21], -1, v13
+; CI-NEXT:    v_cmp_ne_u32_e64 s[22:23], -1, v14
+; CI-NEXT:    v_cmp_ne_u32_e64 s[24:25], -1, v15
+; CI-NEXT:    v_cndmask_b32_e64 v16, 0, v8, s[10:11]
+; CI-NEXT:    v_cndmask_b32_e64 v18, 0, v9, s[12:13]
+; CI-NEXT:    v_cndmask_b32_e64 v20, 0, v10, s[14:15]
+; CI-NEXT:    v_cndmask_b32_e64 v22, 0, v11, s[16:17]
+; CI-NEXT:    v_cndmask_b32_e64 v24, 0, v12, s[18:19]
+; CI-NEXT:    v_cndmask_b32_e64 v26, 0, v13, s[20:21]
+; CI-NEXT:    v_cndmask_b32_e64 v28, 0, v14, s[22:23]
+; CI-NEXT:    v_cndmask_b32_e64 v30, 0, v15, s[24:25]
+; CI-NEXT:    v_cndmask_b32_e32 v9, 0, v31, vcc
+; CI-NEXT:    v_cndmask_b32_e64 v11, 0, v31, s[4:5]
+; CI-NEXT:    v_cndmask_b32_e64 v13, 0, v31, s[6:7]
+; CI-NEXT:    v_cndmask_b32_e64 v15, 0, v31, s[8:9]
+; CI-NEXT:    v_cndmask_b32_e64 v17, 0, v31, s[10:11]
+; CI-NEXT:    v_cndmask_b32_e64 v19, 0, v31, s[12:13]
+; CI-NEXT:    v_cndmask_b32_e64 v21, 0, v31, s[14:15]
+; CI-NEXT:    v_cndmask_b32_e64 v23, 0, v31, s[16:17]
+; CI-NEXT:    v_cndmask_b32_e64 v25, 0, v31, s[18:19]
+; CI-NEXT:    v_cndmask_b32_e64 v27, 0, v31, s[20:21]
+; CI-NEXT:   ...
[truncated]
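For context on the CHECK lines above: each scalar private-to-flat cast compares the 32-bit pointer against -1 (the addrspace(5) null value), then either produces the flat null pointer 0 or pairs the 32-bit offset with the high half of the private aperture (`src_private_base` on GFX9, a queue-pointer load on CI). A minimal Python sketch of that per-element semantics — the function names and the aperture constant below are illustrative, not taken from the patch:

```python
def private_to_flat(ptr32, aperture_hi):
    """Model one scalar addrspace(5) -> addrspace(0) cast.

    A 32-bit private pointer of -1 (0xFFFFFFFF) is the null value in
    addrspace(5) and maps to the flat null pointer 0.  Any other value
    keeps its 32-bit offset in the low half, with the private
    aperture's high half in the upper 32 bits.
    """
    NULL_PRIVATE = 0xFFFFFFFF
    if ptr32 == NULL_PRIVATE:
        return 0
    return (aperture_hi << 32) | ptr32


def vector_private_to_flat(ptrs, aperture_hi):
    # The legalizer unrolls the vector addrspacecast into independent
    # scalar casts, one per element, which is what produces the
    # repeated v_cmp/v_cndmask pairs in the checked assembly.
    return [private_to_flat(p, aperture_hi) for p in ptrs]
```

The reverse (flat-to-private) direction in the earlier tests is the analogous per-element truncation, with non-null flat pointers keeping their low 32 bits and null mapping to -1.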

@arsenm arsenm marked this pull request as ready for review October 28, 2024 21:38
@jhuber6 jhuber6 left a comment

LG, thanks!

@jhuber6 jhuber6 merged commit 88e23eb into main Oct 29, 2024
13 checks passed
@jhuber6 jhuber6 deleted the users/arsenm/amdgpu-dag-fix-vector-addrspacecast branch October 29, 2024 13:08
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
Labels: backend:AMDGPU, llvm:SelectionDAG (SelectionDAGISel as well)