[MLIR][NVVM] [NFC] Update Docs for shfl.sync Op #89044
The first argument to the nvvm_shfl_sync_* family of intrinsics is the thread_mask (aka member_mask). This patch renames the corresponding operand in the Op to match, i.e. `dst` -> `thread_mask`. While here, it also adds a summary and description for this Op.
Signed-off-by: Durgadoss R <[email protected]>
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-llvm
Author: Durgadoss R (durga4github)
Full diff: https://github.com/llvm/llvm-project/pull/89044.diff
1 Files Affected:
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 728e92c9dc8dcf..f76b6d19b89552 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -619,20 +619,33 @@ def ShflKindAttr : EnumAttr<NVVM_Dialect, ShflKind, "shfl_kind">;
def NVVM_ShflOp :
NVVM_Op<"shfl.sync">,
Results<(outs LLVM_Type:$res)>,
- Arguments<(ins I32:$dst,
+ Arguments<(ins I32:$thread_mask,
LLVM_Type:$val,
I32:$offset,
I32:$mask_and_clamp,
ShflKindAttr:$kind,
OptionalAttr<UnitAttr>:$return_value_and_is_valid)> {
+ let summary = "NVVM Dialect Op for shfl.sync";
+ let description = [{
+ The `shfl.sync` Op implements data shuffle within threads of a warp.
+ The `thread_mask` denotes the threads participating in the Op where
+ the bit position corresponds to a particular thread’s laneid.
+ The `offset` specifies a source lane or source lane offset
+ (depending on `kind`). The `val` is the input value to be copied from
+ the source. The `mask_and_clamp` contains two packed values specifying
+ a mask for logically splitting warps into sub-segments and an upper bound
+ for clamping the source lane index.
+ [For more information, refer PTX ISA]
+ (https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-shfl-sync)
+ }];
string llvmBuilder = [{
auto intId = getShflIntrinsicId(
$_resultType, $kind, static_cast<bool>($return_value_and_is_valid));
$res = createIntrinsicCall(builder,
- intId, {$dst, $val, $offset, $mask_and_clamp});
+ intId, {$thread_mask, $val, $offset, $mask_and_clamp});
}];
let assemblyFormat = [{
- $kind $dst `,` $val `,` $offset `,` $mask_and_clamp attr-dict
+ $kind $thread_mask `,` $val `,` $offset `,` $mask_and_clamp attr-dict
`:` type($val) `->` type($res)
}];
let hasVerifier = 1;
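
For readers less familiar with the PTX semantics, the operand roles in the new description can be modelled on the host. The following is a minimal C++ sketch and is not part of this patch. It assumes the PTX ISA layout for `mask_and_clamp` (sub-segment mask in bits 12:8, clamp in bits 4:0) and follows the source-lane selection from the PTX pseudocode as I read it. All names (`ShflKind`, `packMaskAndClamp`, `sourceLane`, `shflSync`) are hypothetical, and lanes outside `thread_mask` simply keep their input here, which is a modelling choice rather than hardware behaviour.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; none of these exist in MLIR or CUDA.
enum class ShflKind { Bfly, Up, Down, Idx };

constexpr int kWarpSize = 32;

// Packs the sub-segment mask into bits 12:8 and the clamp into bits 4:0
// (bit layout assumed from the PTX ISA description of shfl.sync).
constexpr uint32_t packMaskAndClamp(uint32_t segMask, uint32_t clamp) {
  return ((segMask & 0x1f) << 8) | (clamp & 0x1f);
}

// Returns the lane that `lane` reads from, or `lane` itself when the
// computed source lane falls outside the permitted range.
int sourceLane(ShflKind kind, int lane, uint32_t offset,
               uint32_t maskAndClamp) {
  const int bval = static_cast<int>(offset & 0x1f);
  const int cval = static_cast<int>(maskAndClamp & 0x1f);
  const int segMask = static_cast<int>((maskAndClamp >> 8) & 0x1f);
  const int maxLane = (lane & segMask) | (cval & ~segMask);
  const int minLane = lane & segMask;
  int j = lane;
  bool inRange = false;
  switch (kind) {
  case ShflKind::Up:   j = lane - bval; inRange = j >= maxLane; break;
  case ShflKind::Down: j = lane + bval; inRange = j <= maxLane; break;
  case ShflKind::Bfly: j = lane ^ bval; inRange = j <= maxLane; break;
  case ShflKind::Idx:
    j = minLane | (bval & ~segMask);
    inRange = j <= maxLane;
    break;
  }
  return inRange ? j : lane;
}

// Models one shfl.sync across a warp: val[i] is lane i's input value.
// Reading from a lane outside threadMask is undefined in PTX; this model
// just reads the input value of that lane.
std::array<int, kWarpSize> shflSync(ShflKind kind, uint32_t threadMask,
                                    const std::array<int, kWarpSize> &val,
                                    uint32_t offset, uint32_t maskAndClamp) {
  std::array<int, kWarpSize> res = val;
  for (int lane = 0; lane < kWarpSize; ++lane) {
    if (!((threadMask >> lane) & 1u))
      continue; // lane does not participate
    res[lane] = val[sourceLane(kind, lane, offset, maskAndClamp)];
  }
  return res;
}
```

Under this model, a full-warp butterfly exchange with neighbouring lanes would be `shflSync(ShflKind::Bfly, 0xffffffffu, vals, 1, packMaskAndClamp(0, 0x1f))`, mirroring the operand order `thread_mask, val, offset, mask_and_clamp` used by the Op.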
@grypp, please help with the review.
Thanks for the nice documentation.
(It's strange that .github/CODEOWNERS did not catch this PR; I put up another PR.)
@grypp, thanks for the approval. Kindly help merge it.