[NVPTX] Support BFloat Store Parameter #137074

stumpOS · 2025-04-23T22:20:55Z

Before this patch, the instruction selector assumed that if the memory type is not one of {f16, v2f16, f32, f64}, the node must be a ConstantSDNode. In fact, when the memory type is bf16, the node is a ConstantFPSDNode.

llvmbot · 2025-04-23T22:21:30Z

@llvm/pr-subscribers-backend-nvptx

Author: Steffi Stumpos (stumpOS)

Changes

Before this patch, the instruction selector assumed that if the memory type is not one of {f16, v2f16, f32, f64}, the node must be a ConstantSDNode. In fact, when the memory type is bf16, the node is a ConstantFPSDNode.

Full diff: https://github.com/llvm/llvm-project/pull/137074.diff

2 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp (+4-4)
(modified) llvm/test/CodeGen/NVPTX/st-param-imm.ll (+24)

diff --git a/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp b/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
index ec1f969494cd1..e74c8828aaf1b 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
@@ -583,7 +583,7 @@ getOperationOrderings(MemSDNode *N, const NVPTXSubtarget *Subtarget) {
   // |------------------------------------------------------|-------------------------------|
   // | cuda::atomic_load                                    | fence.sc.<scope>;             |
   // |   (memory_order_seq_cst, cuda::thread_scope_<scope>) | ld.acquire.<scope>;           |
-  // |------------------------------------------------------|-------------------------------|  
+  // |------------------------------------------------------|-------------------------------|
   // | cuda::atomic_store                                   | fence.sc.<scope>;             |
   // |   (memory_order_seq_cst, cuda::thread_scope_<scope>) | st.release.<scope>;           |
   // |------------------------------------------------------|-------------------------------|
@@ -1852,7 +1852,7 @@ bool NVPTXDAGToDAGISel::tryStoreParam(SDNode *N) {
     case 1: {
       MVT::SimpleValueType MemTy = Mem->getMemoryVT().getSimpleVT().SimpleTy;
       SDValue Imm = Ops[0];
-      if (MemTy != MVT::f16 && MemTy != MVT::v2f16 &&
+      if (MemTy != MVT::f16 && MemTy != MVT::v2f16 && MemTy != MVT::bf16 &&
           (isa<ConstantSDNode>(Imm) || isa<ConstantFPSDNode>(Imm))) {
         // Convert immediate to target constant
         if (MemTy == MVT::f32 || MemTy == MVT::f64) {
@@ -2808,8 +2808,8 @@ void NVPTXDAGToDAGISel::SelectCpAsyncBulkPrefetchL2(SDNode *N) {
   SDLoc DL(N);
   SmallVector<SDValue, 4> Ops(N->ops().slice(2, NumArgs));
   Ops.push_back(N->getOperand(0)); // Chain operand
-  
-  unsigned Opcode = IsCacheHint 
+
+  unsigned Opcode = IsCacheHint
   ?  NVPTX::CP_ASYNC_BULK_PREFETCH_CH
   :  NVPTX::CP_ASYNC_BULK_PREFETCH;
   ReplaceNode(N, CurDAG->getMachineNode(Opcode, DL, N->getVTList(), Ops));
diff --git a/llvm/test/CodeGen/NVPTX/st-param-imm.ll b/llvm/test/CodeGen/NVPTX/st-param-imm.ll
index ab1447607ab65..d5463b04b3b72 100644
--- a/llvm/test/CodeGen/NVPTX/st-param-imm.ll
+++ b/llvm/test/CodeGen/NVPTX/st-param-imm.ll
@@ -2000,3 +2000,27 @@ declare void @call_v4_i8(%struct.char4 alignstack(4))
 declare void @call_v4_i16(%struct.short4 alignstack(8))
 declare void @call_v4_i32(%struct.int4 alignstack(16))
 declare void @call_v4_f32(%struct.float4 alignstack(16))
+
+define void @st_param_bfloat() {
+; CHECK-LABEL: st_param_bfloat(
+; CHECK: {
+; CHECK-NEXT:	.reg .b16 	%rs<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT:// %bb.0:
+; CHECK-NEXT:	mov.b16 	%rs1, 0x4100;
+; CHECK-NEXT:	{ // callseq 83, 0
+; CHECK-NEXT:	.param .align 2 .b8 param0[2];
+; CHECK-NEXT:	st.param.b16 	[param0], %rs1;
+; CHECK-NEXT:	call.uni
+; CHECK-NEXT:	call_bfloat,
+; CHECK-NEXT:	(
+; CHECK-NEXT:	param0
+; CHECK-NEXT:	);
+; CHECK-NEXT:	} // callseq 83
+; CHECK-NEXT:	ret;
+  %five = bitcast i16 16640 to bfloat
+  call void @call_bfloat(bfloat %five)
+  ret void
+}
+
+declare void @call_bfloat(bfloat)
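
The new test bitcasts `i16 16640` to `bfloat` and expects `mov.b16 %rs1, 0x4100`. The sketch below, which is not part of the patch, shows how those two literals relate: 16640 is 0x4100, and a bfloat16 bit pattern is the upper half of an IEEE-754 binary32 value. The helper names `bf16BitsToFloat` and `floatToBf16BitsTruncating` are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>  // for the assert-based checks suggested in the note below
#include <cstdint>
#include <cstring>

// Widen a bfloat16 bit pattern to float. bfloat16 keeps the sign, the 8-bit
// exponent, and the top 7 mantissa bits of binary32, so shifting the pattern
// into the upper 16 bits reproduces the value exactly.
float bf16BitsToFloat(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float value;
  static_assert(sizeof(value) == sizeof(wide), "float must be 32 bits wide");
  std::memcpy(&value, &wide, sizeof(value));
  return value;
}

// Narrow a float to a bfloat16 bit pattern by dropping the low 16 bits.
// This truncates rather than rounds; it is exact only when those bits are 0.
std::uint16_t floatToBf16BitsTruncating(float value) {
  std::uint32_t wide;
  std::memcpy(&wide, &value, sizeof(wide));
  return static_cast<std::uint16_t>(wide >> 16);
}
```

Under this reading, `bf16BitsToFloat(16640)` is 8.0, while 5.0 would be 0x40A0, so the IR value name `%five` does not match the constant. The CHECK lines only compare the bit pattern, so the test outcome should be unaffected.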

github-actions · 2025-04-23T22:23:13Z

✅ With the latest revision this PR passed the C/C++ code formatter.

justinfargnoli · 2025-04-24T14:21:44Z

llvm/test/CodeGen/NVPTX/st-param-imm.ll

+define void @st_param_v2bfloat(<2 x bfloat> %val) {
+; CHECK-LABEL: st_param_v2bfloat(
+; CHECK:	.param .align 4 .b8 st_param_v2bfloat_param_0[4]
+; CHECK-NEXT: )
+; CHECK-NEXT: {
+; CHECK-NEXT:		.reg .b32 	%r<2>;
+; CHECK-EMPTY:
+; CHECK-NEXT:	// %bb.0:
+; CHECK-NEXT:	ld.param.b32 	%r1, [st_param_v2bfloat_param_0];
+; CHECK-NEXT:	{ // callseq 84, 0
+; CHECK-NEXT:	.param .align 4 .b8 param0[4];
+; CHECK-NEXT:	st.param.b32 	[param0], %r1;
+; CHECK-NEXT:	call.uni
+; CHECK-NEXT:	call_v2bfloat,
+; CHECK-NEXT:	(
+; CHECK-NEXT:	param0
+; CHECK-NEXT:	);
+; CHECK-NEXT:	} // callseq 84
+; CHECK-NEXT:	ret;
+  call void @call_v2bfloat(<2 x bfloat> %val)
+  ret void
+}


It doesn't look like the output of this test has changed (source)

Probably because a ConstantFPSDNode never has a vector type. I will remove this test and the vector check (see Alex's comment).

AlexMaclean

Good catch! A little more cleanup can be performed here.

AlexMaclean · 2025-04-24T15:14:06Z

llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp

@@ -1852,7 +1852,8 @@ bool NVPTXDAGToDAGISel::tryStoreParam(SDNode *N) {
    case 1: {
      MVT::SimpleValueType MemTy = Mem->getMemoryVT().getSimpleVT().SimpleTy;
      SDValue Imm = Ops[0];
-      if (MemTy != MVT::f16 && MemTy != MVT::v2f16 &&
+      if (MemTy != MVT::f16 && MemTy != MVT::v2f16 && MemTy != MVT::bf16 &&


I think we can probably just remove the vector types here. A ConstantFPSDNode should never have one of these types.

ok, I will remove both vector type checks, thanks!

Can you elaborate on why it is correct to remove them? v2f16 and v2bf16 here guard the handling of immediates within this if-statement, so removing them would treat these constants as if they were ConstantInt, which should assert, right?

The conditional also requires the node to be either a ConstantInt or a ConstantFloat, and when a vector type is used the node is neither. I was unable to create a test that generates a constant node with a vector type. In the test I added and then removed, this code path was not hit because the node was a MemIntrinsicSDNode. This is why the test I added for vector types passed before my change (see Justin's comment above).

argh, yes. Somehow missed that.
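
The reasoning in this thread can be sketched as a small standalone model. It does not use the LLVM API: `MemTy`, `NodeKind`, `Version`, `Action`, and `selectStoreParam` are hypothetical stand-ins, and the `AfterPatch` condition reflects the intent stated above (guard f16 and bf16, drop the vector checks) rather than a verified copy of the merged source.

```cpp
#include <cassert>  // for the assert-based checks suggested in the note below

// Hypothetical stand-ins for MVT::SimpleValueType and the SDNode classes.
enum class MemTy { i16, i32, f16, bf16, f32, f64, v2f16, v2bf16 };
enum class NodeKind { ConstantInt, ConstantFP, Other };
enum class Version { BeforePatch, AfterPatch };
enum class Action {
  KeepRegisterForm,  // leave the operand in a register; no immediate folding
  FoldFPImmediate,   // models cast<ConstantFPSDNode> followed by an FP immediate
  FoldIntImmediate,  // models cast<ConstantSDNode> followed by an int immediate
  InvalidCast        // models a cast<> to the wrong node class (would assert)
};

Action selectStoreParam(MemTy ty, NodeKind kind, Version v) {
  const bool isConstant =
      kind == NodeKind::ConstantInt || kind == NodeKind::ConstantFP;
  // Types the guard excludes from immediate folding in each version.
  const bool excluded = v == Version::BeforePatch
                            ? (ty == MemTy::f16 || ty == MemTy::v2f16)
                            : (ty == MemTy::f16 || ty == MemTy::bf16);
  if (excluded || !isConstant)
    return Action::KeepRegisterForm;
  if (ty == MemTy::f32 || ty == MemTy::f64)
    return kind == NodeKind::ConstantFP ? Action::FoldFPImmediate
                                        : Action::InvalidCast;
  // Every other type is assumed to carry an integer constant.
  return kind == NodeKind::ConstantInt ? Action::FoldIntImmediate
                                       : Action::InvalidCast;
}
```

In this model, `selectStoreParam(MemTy::bf16, NodeKind::ConstantFP, Version::BeforePatch)` yields `Action::InvalidCast`, which corresponds to the bug the patch describes, and the same call with `Version::AfterPatch` yields `Action::KeepRegisterForm`, matching the `st.param.b16 [param0], %rs1` form in the test. A vector-typed operand with `NodeKind::Other` never reaches the folding branches in either version, which is the argument given for dropping the vector checks.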

llvmbot added the backend:NVPTX label Apr 23, 2025

npanchen requested review from Artem-B and AlexMaclean April 23, 2025 22:40

justinfargnoli reviewed Apr 24, 2025

AlexMaclean reviewed Apr 24, 2025

AlexMaclean approved these changes Apr 24, 2025

stumpOS added 6 commits April 24, 2025 11:50

c386cc0 support bf16
a5a3fe8 add test
d1e77e0 revert unintentional white space changes
9c20a8f also guard against v2bf16
33a1785 format
b85acb2 remove vector type from conditional

stumpOS force-pushed the stumpos/bf16Fix branch from 1a9f4c5 to b85acb2 Compare April 24, 2025 17:51

npanchen merged commit c007c46 into llvm:main Apr 24, 2025
11 checks passed

Artem-B reviewed Apr 24, 2025
