Skip to content

[NVPTX] Update setmaxnreg intrinsic lowering #125846

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

vvchernov
Copy link
Contributor

@vvchernov vvchernov commented Feb 5, 2025

The setmaxnreg PTX instruction is supported on all arch-conditionals, known up-to cuda-12.8, from sm90 onwards. This patch
updates the predicate checks to handle this. The feature is additionally tested in setmaxnreg-sm100a.ll

@vvchernov
Copy link
Contributor Author

cc @Artem-B @durga4github

@llvmbot
Copy link
Member

llvmbot commented Feb 5, 2025

@llvm/pr-subscribers-backend-nvptx

Author: Valery Chernov (vvchernov)

Changes

The setmaxnreg PTX instruction is supported on all arch-conditionals from sm90 onwards. This patch
updates the predicate checks to handle this. The feature is additionally tested in setmaxnreg-sm100a.ll


Full diff: https://github.com/llvm/llvm-project/pull/125846.diff

3 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.td (+1)
  • (modified) llvm/lib/Target/NVPTX/NVPTXIntrinsics.td (+1-1)
  • (added) llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll (+13)
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index 74def43d825665..ee033e802560ff 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -142,6 +142,7 @@ def hasLDU : Predicate<"Subtarget->hasLDU()">;
 def hasPTXASUnreachableBug : Predicate<"Subtarget->hasPTXASUnreachableBug()">;
 def noPTXASUnreachableBug : Predicate<"!Subtarget->hasPTXASUnreachableBug()">;
 def hasOptEnabled : Predicate<"TM.getOptLevel() != CodeGenOptLevel::None">;
+def hasAAFeatures : Predicate<"Subtarget->hasAAFeatures()">;
 
 def doF32FTZ : Predicate<"useF32FTZ()">;
 def doNoF32FTZ : Predicate<"!useF32FTZ()">;
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index a0d00e4aac560a..f1d0b72e4427da 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -7547,7 +7547,7 @@ multiclass SET_MAXNREG<string Action, Intrinsic Intr> {
   def : NVPTXInst<(outs), (ins i32imm:$reg_count),
           "setmaxnreg." # Action # ".sync.aligned.u32 $reg_count;",
           [(Intr timm:$reg_count)]>,
-    Requires<[hasSM90a, hasPTX<80>]>;
+    Requires<[hasAAFeatures, hasSM<90>, hasPTX<80>]>;
 }
 
 defm INT_SET_MAXNREG_INC : SET_MAXNREG<"inc", int_nvvm_setmaxnreg_inc_sync_aligned_u32>;
diff --git a/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
new file mode 100644
index 00000000000000..6ee9383f5300f8
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
@@ -0,0 +1,13 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | FileCheck --check-prefixes=CHECK %s
+; RUN: %if ptxas-12.6 %{ llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | %ptxas-verify -arch=sm_100a %}
+
+; CHECK-LABEL: test_set_maxn_reg_sm100a
+define void @test_set_maxn_reg_sm100a() {
+  ; CHECK: setmaxnreg.inc.sync.aligned.u32 96;
+  call void @llvm.nvvm.setmaxnreg.inc.sync.aligned.u32(i32 96)
+
+  ; CHECK: setmaxnreg.dec.sync.aligned.u32 64;
+  call void @llvm.nvvm.setmaxnreg.dec.sync.aligned.u32(i32 64)
+
+  ret void
+}
\ No newline at end of file

Copy link
Contributor

@durga4github durga4github left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

LGTM except for a minor ask in the test file

@vvchernov vvchernov marked this pull request as draft February 5, 2025 14:25
Copy link
Member

@Artem-B Artem-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a nit.

@vvchernov vvchernov marked this pull request as ready for review February 6, 2025 09:02
@LewisCrawford LewisCrawford merged commit e225677 into llvm:main Feb 6, 2025
10 checks passed
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
The setmaxnreg PTX instruction is supported on all arch-conditionals,
known up-to cuda-12.8, from sm90 onwards. This patch
updates the predicate checks to handle this. The feature is additionally
tested in setmaxnreg-sm100a.ll
@vvchernov vvchernov deleted the vc/setmaxnreg branch March 22, 2025 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants