-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[LLVM][NVPTX] Add support for griddepcontrol instruction #123511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LLVM][NVPTX] Add support for griddepcontrol instruction #123511
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-nvptx Author: Pradeep Kumar (schwarzschild-radius) ChangesThis commit adds support for griddepcontrol PTX instruction with tests under griddepcontrol.ll Full diff: https://github.com/llvm/llvm-project/pull/123511.diff 4 Files Affected:
diff --git a/llvm/docs/NVPTXUsage.rst b/llvm/docs/NVPTXUsage.rst
index 25a230f65fd3dd..2cfdba20e30c1b 100644
--- a/llvm/docs/NVPTXUsage.rst
+++ b/llvm/docs/NVPTXUsage.rst
@@ -911,6 +911,29 @@ including that ``wgmma.mma_async`` instruction is undefined behavior.
For more information, refer PTX ISA
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#asynchronous-warpgroup-level-matrix-instructions-wgmma-wait-group>`_.
+'``llvm.nvvm.griddepcontrol.*``'
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Syntax:
+"""""""
+
+.. code-block:: llvm
+
+ declare void @llvm.nvvm.griddepcontrol.launch_dependents()
+ declare void @llvm.nvvm.griddepcontrol.wait()
+
+Overview:
+"""""""""
+
+The ``griddepcontrol`` intrinsics allows the dependent grids and prerequisite grids as defined by the runtime, to control execution in the following way:
+
+``griddepcontrol.launch_dependents`` intrinsic signals that specific dependents the runtime system designated to react to this intrinsic can be scheduled as soon as all other CTAs in the grid issue the same intrinsic or have completed. The dependent may launch before the completion of the current grid. There is no guarantee that the dependent will launch before the completion of the current grid. Repeated invocations of this intrinsic by threads in the current CTA will have no additional side effects past that of the first invocation.
+
+``griddepcontrol.wait`` intrinsic causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid.
+
+For more information, refer
+`PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol>`__.
+
Other Intrinsics
----------------
diff --git a/llvm/include/llvm/IR/IntrinsicsNVVM.td b/llvm/include/llvm/IR/IntrinsicsNVVM.td
index 00a76018d8415d..b6f60563b763a8 100644
--- a/llvm/include/llvm/IR/IntrinsicsNVVM.td
+++ b/llvm/include/llvm/IR/IntrinsicsNVVM.td
@@ -5033,4 +5033,8 @@ def int_nvvm_cp_async_bulk_shared_cta_to_global
NoCapture<ArgIndex<0>>, NoCapture<ArgIndex<1>>,
ImmArg<ArgIndex<4>>]>;
+
+def int_nvvm_griddepcontrol_launch_dependents: DefaultAttrsIntrinsic<[], []>;
+def int_nvvm_griddepcontrol_wait: DefaultAttrsIntrinsic<[], []>;
+
} // let TargetPrefix = "nvvm"
diff --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index 48d75728aef8e2..8552e2999be52f 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -7557,4 +7557,16 @@ def INT_NVVM_WGMMA_WAIT_GROUP_SYNC_ALIGNED : NVPTXInst<(outs), (ins i64imm:$n),
[(int_nvvm_wgmma_wait_group_sync_aligned timm:$n)]>, Requires<[hasSM90a, hasPTX<80>]>;
} // isConvergent = true
+def GRIDDEPCONTROL_LAUNCH_DEPENDENTS :
+ NVPTXInst<(outs), (ins),
+ "griddepcontrol.launch_dependents;",
+ [(int_nvvm_griddepcontrol_launch_dependents)]>,
+ Requires<[hasSM<90>, hasPTX<78>]>;
+
+def GRIDDEPCONTROL_WAIT :
+ NVPTXInst<(outs), (ins),
+ "griddepcontrol.wait;",
+ [(int_nvvm_griddepcontrol_wait)]>,
+ Requires<[hasSM<90>, hasPTX<78>]>;
+
def INT_EXIT : NVPTXInst<(outs), (ins), "exit;", [(int_nvvm_exit)]>;
diff --git a/llvm/test/CodeGen/NVPTX/griddepcontrol.ll b/llvm/test/CodeGen/NVPTX/griddepcontrol.ll
new file mode 100644
index 00000000000000..fe15b3fe4afbd9
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/griddepcontrol.ll
@@ -0,0 +1,17 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mcpu=sm_90 -march=nvptx64 | FileCheck %s
+; RUN: %if ptxas-11.8 %{ llc < %s -mcpu=sm_90 -march=nvptx64 | %ptxas-verify %}
+
+define void @griddepcontrol() {
+; CHECK-LABEL: griddepcontrol(
+; CHECK: {
+; CHECK-EMPTY:
+; CHECK-EMPTY:
+; CHECK-NEXT: // %bb.0:
+; CHECK-NEXT: griddepcontrol.launch_dependents;
+; CHECK-NEXT: griddepcontrol.wait;
+; CHECK-NEXT: ret;
+ call void @llvm.nvvm.griddepcontrol.launch.dependents()
+ call void @llvm.nvvm.griddepcontrol.wait()
+ ret void
+}
|
Changes look good to me, let us wait for Artem's review. |
4d71cbb
to
13b2b41
Compare
This commit adds support for griddepcontrol PTX instruction with tests under griddepcontrol.ll
13b2b41
to
036e5f5
Compare
@Artem-B Just a gentle ping on the review. Please let me know if the change is good to go |
This commit adds support for griddepcontrol PTX instruction with tests under griddepcontrol.ll