[MLIR][GPU] Add gpu.cluster_dim_blocks and gpu.cluster_block_id Ops #95245
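@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-gpu
Author: Pradeep Kumar (schwarzschild-radius)
Changes
This commit adds support for `gpu.cluster_dim_blocks` and `gpu.cluster_block_id` Ops to represent the number of blocks per cluster and the block id inside a cluster, respectively. It also fixes the description of the `gpu.cluster_dim` Op and updates the `cga_cluster.mlir` test file to use `gpu.cluster_dim_blocks`.
Full diff: https://github.com/llvm/llvm-project/pull/95245.diff
5 Files Affected: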
diff --git a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
index eb81b6469746f..e7e55ab42d51f 100644
--- a/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
+++ b/mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
@@ -70,7 +70,7 @@ class GPU_IndexOp<string mnemonic, list<Trait> traits = []> :
def GPU_ClusterDimOp : GPU_IndexOp<"cluster_dim"> {
let description = [{
- Returns the number of thread blocks in the cluster along
+ Returns the number of cluster identifiers per grid along
the x, y, or z `dimension`.
Example:
@@ -81,6 +81,19 @@ def GPU_ClusterDimOp : GPU_IndexOp<"cluster_dim"> {
}];
}
+def GPU_ClusterDimBlocksOp : GPU_IndexOp<"cluster_dim_blocks"> {
+ let description = [{
+ Returns the number of thread blocks in the cluster along
+ the x, y, or z `dimension`.
+
+ Example:
+
+ ```mlir
+ %cDimBlocksX = gpu.cluster_dim_blocks x
+ ```
+ }];
+}
+
def GPU_ClusterIdOp : GPU_IndexOp<"cluster_id"> {
let description = [{
Returns the cluster id, i.e. the index of the current cluster within the
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 4daeeab093863..4d48b3de7a57e 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -160,9 +160,9 @@ def NVVM_ClusterDimZOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.nclusterid.z">;
def NVVM_BlockInClusterIdXOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.ctaid.x">;
def NVVM_BlockInClusterIdYOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.ctaid.y">;
def NVVM_BlockInClusterIdZOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.ctaid.z">;
-def NVVM_GridInClusterDimXOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.x">;
-def NVVM_GridInClusterDimYOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.y">;
-def NVVM_GridInClusterDimZOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.z">;
+def NVVM_ClusterDimBlocksXOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.x">;
+def NVVM_ClusterDimBlocksYOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.y">;
+def NVVM_ClusterDimBlocksZOp : NVVM_SpecialRegisterOp<"read.ptx.sreg.cluster.nctaid.z">;
//===----------------------------------------------------------------------===//
// CTA index and across Cluster dimensions
diff --git a/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp b/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
index b95fba20a00cb..811f9efb62951 100644
--- a/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
+++ b/mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp
@@ -344,6 +344,8 @@ void mlir::populateGpuToNVVMConversionPatterns(LLVMTypeConverter &converter,
NVVM::ClusterIdYOp, NVVM::ClusterIdZOp>,
GPUIndexIntrinsicOpLowering<gpu::ClusterDimOp, NVVM::ClusterDimXOp,
NVVM::ClusterDimYOp, NVVM::ClusterDimZOp>,
+ GPUIndexIntrinsicOpLowering<gpu::ClusterDimBlocksOp, NVVM::ClusterDimBlocksXOp,
+ NVVM::ClusterDimBlocksYOp, NVVM::ClusterDimBlocksZOp>,
GPUIndexIntrinsicOpLowering<gpu::BlockIdOp, NVVM::BlockIdXOp,
NVVM::BlockIdYOp, NVVM::BlockIdZOp>,
GPUIndexIntrinsicOpLowering<gpu::GridDimOp, NVVM::GridDimXOp,
diff --git a/mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp b/mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
index 69017efb9a0e6..80ea102c03bd2 100644
--- a/mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
+++ b/mlir/lib/Dialect/GPU/IR/InferIntRangeInterfaceImpls.cpp
@@ -86,6 +86,11 @@ static std::optional<uint64_t> getKnownLaunchDim(Op op, LaunchDims type) {
void ClusterDimOp::inferResultRanges(ArrayRef<ConstantIntRanges>,
SetIntRangeFn setResultRange) {
+ setResultRange(getResult(), getIndexRange(1, kMaxDim));
+}
+
+void ClusterDimBlocksOp::inferResultRanges(ArrayRef<ConstantIntRanges>,
+ SetIntRangeFn setResultRange) {
setResultRange(getResult(), getIndexRange(1, kMaxClusterDim));
}
diff --git a/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir b/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
index 025282ec0d688..5c11d80178f72 100644
--- a/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
+++ b/mlir/test/Integration/GPU/CUDA/sm90/cga_cluster.mlir
@@ -22,9 +22,9 @@ module attributes {gpu.container_module} {
%cidX = gpu.cluster_id x
%cidY = gpu.cluster_id y
%cidZ = gpu.cluster_id z
- %cdimX = gpu.cluster_dim x
- %cdimY = gpu.cluster_dim y
- %cdimZ = gpu.cluster_dim z
+ %cdimX = gpu.cluster_dim_blocks x
+ %cdimY = gpu.cluster_dim_blocks y
+ %cdimZ = gpu.cluster_dim_blocks z
%bidX = gpu.block_id x
%bidY = gpu.block_id y
%bidZ = gpu.block_id z
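The lowering hunk above pairs each GPU index op with one NVVM special-register op per dimension. As a rough illustration only, the following standalone C++ sketch maps the two cluster-dimension ops to the intrinsic names that appear in the `NVVMOps.td` hunk. The function `sregIntrinsicFor` and the `Dim` enum are hypothetical helpers, not MLIR APIs, and the `.x`/`.y` names for `gpu.cluster_dim` are assumed by analogy with the `.z` name visible in the diff.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the dimension attribute carried by the GPU index ops.
enum class Dim { X, Y, Z };

// Hypothetical helper (not part of MLIR): returns the NVVM special-register
// intrinsic name an op would read after lowering, or nullopt for ops this
// sketch does not cover.
std::optional<std::string> sregIntrinsicFor(const std::string &gpuOp, Dim dim) {
  std::string base;
  if (gpuOp == "gpu.cluster_dim") {
    // Number of clusters in the grid (NVVM::ClusterDim{X,Y,Z}Op).
    base = "read.ptx.sreg.nclusterid.";
  } else if (gpuOp == "gpu.cluster_dim_blocks") {
    // Number of thread blocks in a cluster (NVVM::ClusterDimBlocks{X,Y,Z}Op).
    base = "read.ptx.sreg.cluster.nctaid.";
  } else {
    return std::nullopt;
  }
  const char suffix[] = {'x', 'y', 'z'};
  return base + suffix[static_cast<int>(dim)];
}
```

The real conversion uses the `GPUIndexIntrinsicOpLowering` pattern shown in the diff; this sketch only mirrors the name selection, not the rewrite itself.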
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from cd6d4cf to 4430b67.
CC += @durga4github for viz
Looks great. I left some comments.
This commit adds support for the `gpu.cluster_dim_blocks` and `gpu.cluster_block_id` Ops, which represent the number of blocks per cluster and the block id inside a cluster, respectively. It also fixes the description of the `gpu.cluster_dim` Op and updates the `cga_cluster.mlir` test file to use `gpu.cluster_dim_blocks`.
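To make the distinction between the four cluster-related values concrete, here is a minimal one-dimensional C++ sketch. It assumes the usual CUDA cluster layout, in which consecutive thread blocks are grouped into clusters and the grid size is an exact multiple of the cluster size. The struct `ClusterView` and the function `describeBlock` are hypothetical names used only for illustration; they do not exist in MLIR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one launch dimension; each field names the GPU
// dialect op that would return the corresponding value.
struct ClusterView {
  uint32_t clusterDim;       // gpu.cluster_dim: clusters per grid
  uint32_t clusterDimBlocks; // gpu.cluster_dim_blocks: blocks per cluster
  uint32_t clusterId;        // gpu.cluster_id: index of the cluster in the grid
  uint32_t clusterBlockId;   // gpu.cluster_block_id: index of the block in its cluster
};

// Assumption: gridDim (in blocks) is an exact multiple of blocksPerCluster
// and consecutive blocks belong to the same cluster.
ClusterView describeBlock(uint32_t gridDim, uint32_t blocksPerCluster,
                          uint32_t blockId) {
  assert(blocksPerCluster > 0 && gridDim % blocksPerCluster == 0);
  assert(blockId < gridDim);
  return {gridDim / blocksPerCluster, blocksPerCluster,
          blockId / blocksPerCluster, blockId % blocksPerCluster};
}
```

For example, with a grid of 8 blocks and 2 blocks per cluster, block 5 sits in cluster 2 at position 1, and the grid holds 4 clusters.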
Force-pushed from 4430b67 to 87b384b.
Looks good to me.
Co-authored-by: Guray Ozen <[email protected]>
@durga4github, can you please merge it?
…lvm#95245)

This commit adds support for `gpu.cluster_dim_blocks` and `gpu.cluster_block_id` Ops to represent number of blocks per cluster and block id inside a cluster respectively. Also, fixed the description of `gpu.cluster_dim` Op and updated the `cga_cluster.mlir` test file to use `gpu.cluster_dim_blocks`

Co-authored-by: pradeepku <[email protected]>
Co-authored-by: Guray Ozen <[email protected]>