[NVPTX] Update setmaxnreg intrinsic lowering #125846
@llvm/pr-subscribers-backend-nvptx
Author: Valery Chernov (vvchernov)
Changes
The setmaxnreg PTX instruction is supported on all arch-conditionals from sm90 onwards. This patch updates the predicate checks to handle this.
Full diff: https://github.com/llvm/llvm-project/pull/125846.diff
3 Files Affected:
diff --git a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
index 74def43d825665..ee033e802560ff 100644
--- a/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+++ b/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
@@ -142,6 +142,7 @@ def hasLDU : Predicate<"Subtarget->hasLDU()">;
def hasPTXASUnreachableBug : Predicate<"Subtarget->hasPTXASUnreachableBug()">;
def noPTXASUnreachableBug : Predicate<"!Subtarget->hasPTXASUnreachableBug()">;
def hasOptEnabled : Predicate<"TM.getOptLevel() != CodeGenOptLevel::None">;
+def hasAAFeatures : Predicate<"Subtarget->hasAAFeatures()">;
def doF32FTZ : Predicate<"useF32FTZ()">;
def doNoF32FTZ : Predicate<"!useF32FTZ()">;
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index a0d00e4aac560a..f1d0b72e4427da 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -7547,7 +7547,7 @@ multiclass SET_MAXNREG<string Action, Intrinsic Intr> {
def : NVPTXInst<(outs), (ins i32imm:$reg_count),
"setmaxnreg." # Action # ".sync.aligned.u32 $reg_count;",
[(Intr timm:$reg_count)]>,
- Requires<[hasSM90a, hasPTX<80>]>;
+ Requires<[hasAAFeatures, hasSM<90>, hasPTX<80>]>;
}
defm INT_SET_MAXNREG_INC : SET_MAXNREG<"inc", int_nvvm_setmaxnreg_inc_sync_aligned_u32>;
diff --git a/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
new file mode 100644
index 00000000000000..6ee9383f5300f8
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/setmaxnreg-sm100a.ll
@@ -0,0 +1,13 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | FileCheck --check-prefixes=CHECK %s
+; RUN: %if ptxas-12.6 %{ llc < %s -march=nvptx64 -mcpu=sm_100a -mattr=+ptx86 | %ptxas-verify -arch=sm_100a %}
+
+; CHECK-LABEL: test_set_maxn_reg_sm100a
+define void @test_set_maxn_reg_sm100a() {
+ ; CHECK: setmaxnreg.inc.sync.aligned.u32 96;
+ call void @llvm.nvvm.setmaxnreg.inc.sync.aligned.u32(i32 96)
+
+ ; CHECK: setmaxnreg.dec.sync.aligned.u32 64;
+ call void @llvm.nvvm.setmaxnreg.dec.sync.aligned.u32(i32 64)
+
+ ret void
+}
\ No newline at end of file
Thanks for the fix.
LGTM except for a minor ask in the test file
LGTM with a nit.
The setmaxnreg PTX instruction is supported on all arch-conditional targets (those known up to CUDA 12.8) from sm90 onwards. This patch updates the predicate checks to handle this. The feature is additionally tested in setmaxnreg-sm100a.ll.
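To make the effect of the predicate change concrete, here is a minimal standalone C++ sketch of the two `Requires<...>` lists from the diff. It is not LLVM code: `SubtargetModel`, `selectableBefore`, and `selectableAfter` are hypothetical names invented for illustration, and it assumes that `hasSM90a` matches exactly sm_90a while `hasSM<90>` and `hasPTX<80>` are "at least this version" checks.

```cpp
// Hypothetical stand-in for the subtarget queries behind the predicates.
// This is NOT the real NVPTXSubtarget class; all names are illustrative.
struct SubtargetModel {
  unsigned smVersion;    // e.g. 90 for sm_90 / sm_90a, 100 for sm_100a
  bool archConditional;  // true for the "a" variants (sm_90a, sm_100a)
  unsigned ptxVersion;   // e.g. 80 for PTX 8.0, 86 for PTX 8.6

  bool hasAAFeatures() const { return archConditional; }
  bool hasSM(unsigned v) const { return smVersion >= v; }
  bool hasPTX(unsigned v) const { return ptxVersion >= v; }
};

// Before the patch: Requires<[hasSM90a, hasPTX<80>]>
// Assumption: hasSM90a accepts sm_90a only.
bool selectableBefore(const SubtargetModel &st) {
  return st.smVersion == 90 && st.archConditional && st.hasPTX(80);
}

// After the patch: Requires<[hasAAFeatures, hasSM<90>, hasPTX<80>]>
bool selectableAfter(const SubtargetModel &st) {
  return st.hasAAFeatures() && st.hasSM(90) && st.hasPTX(80);
}
```

Under these assumptions, sm_90a with PTX 8.0 is accepted by both versions, sm_100a with PTX 8.6 (the configuration used in setmaxnreg-sm100a.ll) is accepted only by the new one, and a target without the arch-conditional suffix is rejected by both.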