[NVPTX] Update setmaxnreg intrinsic lowering #125846

vvchernov · 2025-02-05T12:31:29Z (edited by durga4github)

The setmaxnreg PTX instruction is supported on every arch-conditional target from sm_90 onwards (as known up to CUDA 12.8), not only on sm_90a. This patch
updates the predicate checks accordingly, replacing the hasSM90a requirement with hasAAFeatures plus hasSM<90>. The new lowering is additionally tested in setmaxnreg-sm100a.ll.
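
To make the effect of the predicate change concrete, here is a minimal, self-contained C++ sketch of the old and new selection conditions. It is not the NVPTX backend code: SubtargetModel, oldPredicate and newPredicate are hypothetical names, and the encoding of the "full" SM version (SM number times 10, plus 1 for the arch-conditional "a" suffix, so sm_90a is 901) is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the subtarget state that the predicates query.
// Assumption: FullSmVersion = SM number * 10, plus 1 when the target has the
// arch-conditional "a" suffix (sm_90 -> 900, sm_90a -> 901, sm_100a -> 1001).
struct SubtargetModel {
  unsigned FullSmVersion;
  unsigned PTXVersion;
};

constexpr unsigned smVersion(SubtargetModel S) { return S.FullSmVersion / 10; }

// Models hasAAFeatures: true only for arch-conditional ("a") targets.
constexpr bool hasAAFeatures(SubtargetModel S) {
  return S.FullSmVersion % 10 != 0;
}

// Old condition: Requires<[hasSM90a, hasPTX<80>]>
constexpr bool oldPredicate(SubtargetModel S) {
  return S.FullSmVersion == 901 && S.PTXVersion >= 80;
}

// New condition: Requires<[hasAAFeatures, hasSM<90>, hasPTX<80>]>
constexpr bool newPredicate(SubtargetModel S) {
  return hasAAFeatures(S) && smVersion(S) >= 90 && S.PTXVersion >= 80;
}

// sm_90a with PTX 8.0 is accepted both before and after the patch.
static_assert(oldPredicate(SubtargetModel{901, 80}), "sm_90a: old predicate");
static_assert(newPredicate(SubtargetModel{901, 80}), "sm_90a: new predicate");

// sm_100a with PTX 8.6 is only accepted by the new condition.
static_assert(!oldPredicate(SubtargetModel{1001, 86}), "sm_100a: old predicate");
static_assert(newPredicate(SubtargetModel{1001, 86}), "sm_100a: new predicate");

// Runtime form of the invariant: everything the old condition accepted is
// still accepted by the new one.
inline void checkNewIsSuperset(SubtargetModel S) {
  assert(!oldPredicate(S) || newPredicate(S));
}
```

Under these assumptions, plain sm_90 (900) is still rejected, because hasAAFeatures is false without the "a" suffix, while sm_100a is newly accepted.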

vvchernov · 2025-02-05T12:32:59Z

cc @Artem-B @durga4github

llvmbot · 2025-02-05T12:35:40Z

@llvm/pr-subscribers-backend-nvptx

Author: Valery Chernov (vvchernov)

Changes

The setmaxnreg PTX instruction is supported on all arch-conditional targets from sm_90 onwards. This patch
updates the predicate checks to handle this. The feature is additionally tested in setmaxnreg-sm100a.ll.

Full diff: https://github.com/llvm/llvm-project/pull/125846.diff

3 Files Affected:

(modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (+1)
(modified) llvm/lib/Target/NVPTX/NVPTXIntrinsics.td (+1-1)
(added) llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll (+13)

diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index 74def43d825665..ee033e802560ff 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -142,6 +142,7 @@ def hasLDU : Predicate<"Subtarget->hasLDU()">;
 def hasPTXASUnreachableBug : Predicate<"Subtarget->hasPTXASUnreachableBug()">;
 def noPTXASUnreachableBug : Predicate<"!Subtarget->hasPTXASUnreachableBug()">;
 def hasOptEnabled : Predicate<"TM.getOptLevel() != CodeGenOptLevel::None">;
+def hasAAFeatures : Predicate<"Subtarget->hasAAFeatures()">;
 
 def doF32FTZ : Predicate<"useF32FTZ()">;
 def doNoF32FTZ : Predicate<"!useF32FTZ()">;
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index a0d00e4aac560a..f1d0b72e4427da 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -7547,7 +7547,7 @@ multiclass SET_MAXNREG<string Action, Intrinsic Intr> {
   def : NVPTXInst<(outs), (ins i32imm:$reg_count),
           "setmaxnreg." # Action # ".sync.aligned.u32 $reg_count;",
           [(Intr timm:$reg_count)]>,
-    Requires<[hasSM90a, hasPTX<80>]>;
+    Requires<[hasAAFeatures, hasSM<90>, hasPTX<80>]>;
 }
 
 defm INT_SET_MAXNREG_INC : SET_MAXNREG<"inc", int_nvvm_setmaxnreg_inc_sync_aligned_u32>;
diff --git a/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
new file mode 100644
index 00000000000000..6ee9383f5300f8
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
@@ -0,0 +1,13 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | FileCheck --check-prefixes=CHECK %s
+; RUN: %if ptxas-12.6 %{ llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | %ptxas-verify -arch=sm_100a %}
+
+; CHECK-LABEL: test_set_maxn_reg_sm100a
+define void @test_set_maxn_reg_sm100a() {
+  ; CHECK: setmaxnreg.inc.sync.aligned.u32 96;
+  call void @llvm.nvvm.setmaxnreg.inc.sync.aligned.u32(i32 96)
+
+  ; CHECK: setmaxnreg.dec.sync.aligned.u32 64;
+  call void @llvm.nvvm.setmaxnreg.dec.sync.aligned.u32(i32 64)
+
+  ret void
+}
\ No newline at end of file

durga4github

Thanks for the fix.

LGTM except for a minor ask in the test file, llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll.

Artem-B

LGTM with a nit in llvm/lib/Target/NVPTX/NVPTXInstrInfo.td.

llvmbot added the backend:NVPTX label Feb 5, 2025

durga4github reviewed Feb 5, 2025

llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll (review thread outdated, resolved)

durga4github reviewed Feb 5, 2025

vvchernov marked this pull request as draft February 5, 2025 14:25

vvchernov force-pushed the vc/setmaxnreg branch from d99976a to 460c22b Compare February 5, 2025 14:33

justinfargnoli reviewed Feb 5, 2025

llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll (review thread outdated, resolved)

Artem-B approved these changes Feb 5, 2025

llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (review thread outdated, resolved)

Update setmaxnreg intrinsic lowering

a460eb7

vvchernov force-pushed the vc/setmaxnreg branch from 460c22b to a460eb7 Compare February 6, 2025 09:00

vvchernov marked this pull request as ready for review February 6, 2025 09:02

durga4github approved these changes Feb 6, 2025

LewisCrawford merged commit e225677 into llvm:main Feb 6, 2025
10 checks passed

vvchernov deleted the vc/setmaxnreg branch March 22, 2025 07:44
