Skip to content

Commit 435609b

Browse files
[LLVM][NVPTX] Add support for griddepcontrol instruction (#123511)
This commit adds support for griddepcontrol PTX instruction with tests under griddepcontrol.ll
1 parent c9bc242 commit 435609b

File tree

4 files changed

+55
-0
lines changed

4 files changed

+55
-0
lines changed

llvm/docs/NVPTXUsage.rst

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -939,6 +939,29 @@ including that ``wgmma.mma_async`` instruction is undefined behavior.
939939
For more information, refer PTX ISA
940940
`<https://docs.nvidia.com/cuda/parallel-thread-execution/#asynchronous-warpgroup-level-matrix-instructions-wgmma-wait-group>`_.
941941

942+
'``llvm.nvvm.griddepcontrol.*``'
943+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
944+
945+
Syntax:
946+
"""""""
947+
948+
.. code-block:: llvm
949+
950+
declare void @llvm.nvvm.griddepcontrol.launch_dependents()
951+
declare void @llvm.nvvm.griddepcontrol.wait()
952+
953+
Overview:
954+
"""""""""
955+
956+
The ``griddepcontrol`` intrinsics allows the dependent grids and prerequisite grids as defined by the runtime, to control execution in the following way:
957+
958+
``griddepcontrol.launch_dependents`` intrinsic signals that the dependents can be scheduled, before the current grid completes. The intrinsic can be invoked by multiple threads in the current CTA and repeated invocations of the intrinsic will have no additional side effects past that of the first invocation.
959+
960+
``griddepcontrol.wait`` intrinsic causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid.
961+
962+
For more information, refer
963+
`PTX ISA <https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol>`__.
964+
942965
Other Intrinsics
943966
----------------
944967

llvm/include/llvm/IR/IntrinsicsNVVM.td

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5044,4 +5044,7 @@ def int_nvvm_cp_async_bulk_prefetch_L2
50445044
NoCapture<ArgIndex<0>>, ReadOnly<ArgIndex<0>>,
50455045
ImmArg<ArgIndex<3>>]>;
50465046

5047+
def int_nvvm_griddepcontrol_launch_dependents: Intrinsic<[], [], [IntrNoMem, IntrHasSideEffects]>;
5048+
def int_nvvm_griddepcontrol_wait: Intrinsic<[], [], [IntrNoMem, IntrHasSideEffects]>;
5049+
50475050
} // let TargetPrefix = "nvvm"

llvm/lib/Target/NVPTX/NVPTXIntrinsics.td

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7569,4 +7569,16 @@ def INT_NVVM_WGMMA_WAIT_GROUP_SYNC_ALIGNED : NVPTXInst<(outs), (ins i64imm:$n),
75697569
[(int_nvvm_wgmma_wait_group_sync_aligned timm:$n)]>, Requires<[hasSM90a, hasPTX<80>]>;
75707570
} // isConvergent = true
75717571

7572+
def GRIDDEPCONTROL_LAUNCH_DEPENDENTS :
7573+
NVPTXInst<(outs), (ins),
7574+
"griddepcontrol.launch_dependents;",
7575+
[(int_nvvm_griddepcontrol_launch_dependents)]>,
7576+
Requires<[hasSM<90>, hasPTX<78>]>;
7577+
7578+
def GRIDDEPCONTROL_WAIT :
7579+
NVPTXInst<(outs), (ins),
7580+
"griddepcontrol.wait;",
7581+
[(int_nvvm_griddepcontrol_wait)]>,
7582+
Requires<[hasSM<90>, hasPTX<78>]>;
7583+
75727584
def INT_EXIT : NVPTXInst<(outs), (ins), "exit;", [(int_nvvm_exit)]>;
Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,17 @@
1+
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
2+
; RUN: llc < %s -mcpu=sm_90 -march=nvptx64 | FileCheck %s
3+
; RUN: %if ptxas-11.8 %{ llc < %s -mcpu=sm_90 -march=nvptx64 | %ptxas-verify %}
4+
5+
define void @griddepcontrol() {
6+
; CHECK-LABEL: griddepcontrol(
7+
; CHECK: {
8+
; CHECK-EMPTY:
9+
; CHECK-EMPTY:
10+
; CHECK-NEXT: // %bb.0:
11+
; CHECK-NEXT: griddepcontrol.launch_dependents;
12+
; CHECK-NEXT: griddepcontrol.wait;
13+
; CHECK-NEXT: ret;
14+
call void @llvm.nvvm.griddepcontrol.launch.dependents()
15+
call void @llvm.nvvm.griddepcontrol.wait()
16+
ret void
17+
}

0 commit comments

Comments
 (0)