Commit 6e30d97

[MLIR][NVVM] [NFC] Update Docs for shfl.sync Op (#89044)

The first argument to the nvvm_shfl_sync_* family of intrinsics is the thread_mask (aka member_mask). This patch renames the corresponding operand in the Op to match, i.e. `dst` -> `thread_mask`. While here, it also adds a summary and description for this Op.

Signed-off-by: Durgadoss R <[email protected]>
1 parent 5d4e072 commit 6e30d97

1 file changed, 16 additions and 3 deletions
mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td

@@ -619,20 +619,33 @@ def ShflKindAttr : EnumAttr<NVVM_Dialect, ShflKind, "shfl_kind">;
 def NVVM_ShflOp :
   NVVM_Op<"shfl.sync">,
   Results<(outs LLVM_Type:$res)>,
-  Arguments<(ins I32:$dst,
+  Arguments<(ins I32:$thread_mask,
                  LLVM_Type:$val,
                  I32:$offset,
                  I32:$mask_and_clamp,
                  ShflKindAttr:$kind,
                  OptionalAttr<UnitAttr>:$return_value_and_is_valid)> {
+  let summary = "NVVM Dialect Op for shfl.sync";
+  let description = [{
+    The `shfl.sync` Op implements data shuffle within threads of a warp.
+    The `thread_mask` denotes the threads participating in the Op where
+    the bit position corresponds to a particular thread’s laneid.
+    The `offset` specifies a source lane or source lane offset
+    (depending on `kind`). The `val` is the input value to be copied from
+    the source. The `mask_and_clamp` contains two packed values specifying
+    a mask for logically splitting warps into sub-segments and an upper bound
+    for clamping the source lane index.
+    [For more information, refer PTX ISA]
+    (https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-shfl-sync)
+  }];
   string llvmBuilder = [{
       auto intId = getShflIntrinsicId(
           $_resultType, $kind, static_cast<bool>($return_value_and_is_valid));
       $res = createIntrinsicCall(builder,
-          intId, {$dst, $val, $offset, $mask_and_clamp});
+          intId, {$thread_mask, $val, $offset, $mask_and_clamp});
   }];
   let assemblyFormat = [{
-    $kind $dst `,` $val `,` $offset `,` $mask_and_clamp attr-dict
+    $kind $thread_mask `,` $val `,` $offset `,` $mask_and_clamp attr-dict
      `:` type($val) `->` type($res)
   }];
   let hasVerifier = 1;
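
The operand semantics in the new description can be illustrated with a small host-side model. This is only a sketch, not code from the patch or from MLIR: the names `shflSourceLane`, `participates`, and `ShflSource` are hypothetical, and the bit layout of `mask_and_clamp` (clamp in bits 4:0, segment mask in bits 12:8) is an assumption taken from the PTX ISA description of `shfl.sync`, which the commit only links to. The `inRange` flag is assumed to correspond to the validity predicate returned when `return_value_and_is_valid` is set.

```cpp
#include <cstdint>

// Shuffle modes, named after the ShflKind values (bfly, up, down, idx).
enum class ShflKind { Bfly, Up, Down, Idx };

// Result of the lane computation for one thread.
struct ShflSource {
  uint32_t lane;  // lane whose `val` this thread reads
  bool inRange;   // false if the computed lane was out of range
};

// True if the bit for `laneId` is set in `threadMask`.
inline bool participates(uint32_t threadMask, uint32_t laneId) {
  return ((threadMask >> (laneId & 0x1f)) & 1u) != 0;
}

// Models the source lane selection described in the PTX ISA (assumed layout:
// clamp value in bits [4:0], segment mask in bits [12:8] of maskAndClamp).
inline ShflSource shflSourceLane(ShflKind kind, uint32_t laneId,
                                 uint32_t offset, uint32_t maskAndClamp) {
  const int32_t lane = static_cast<int32_t>(laneId & 0x1f);
  const int32_t bval = static_cast<int32_t>(offset & 0x1f);
  const int32_t cval = static_cast<int32_t>(maskAndClamp & 0x1f);
  const int32_t segmask = static_cast<int32_t>((maskAndClamp >> 8) & 0x1f);

  // Bounds of the sub-segment that this lane belongs to.
  const int32_t maxLane = (lane & segmask) | (cval & ~segmask);
  const int32_t minLane = lane & segmask;

  int32_t src = lane;
  bool valid = false;
  switch (kind) {
  case ShflKind::Up:
    src = lane - bval;
    valid = src >= maxLane;
    break;
  case ShflKind::Down:
    src = lane + bval;
    valid = src <= maxLane;
    break;
  case ShflKind::Bfly:
    src = lane ^ bval;
    valid = src <= maxLane;
    break;
  case ShflKind::Idx:
    src = minLane | (bval & ~segmask);
    valid = src <= maxLane;
    break;
  }
  // An out-of-range source makes the thread read its own value.
  if (!valid)
    src = lane;
  return {static_cast<uint32_t>(src), valid};
}
```

Under these assumptions, `shflSourceLane(ShflKind::Bfly, 3, 1, 0x1f)` selects lane 2, while an `Up` shuffle by 1 on lane 0 with `mask_and_clamp` equal to 0 falls out of range and keeps lane 0.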
