[mlir][xegpu] XeGPU distribution patterns for load_nd, store_nd, and create_nd_tdesc. #119783


Closed

Conversation


@charithaintc charithaintc commented Dec 12, 2024

This introduces SIMT distribution patterns for XeGPU load_nd, store_nd, and create_nd_tdesc operations. For these operations, TensorDescType is not distributed. This PR is based on @kurapov-peter's earlier draft #112945.
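The distribution idea can be sketched in IR form. The shapes and op mix below are illustrative only, adapted from the snippets quoted later in this review, not the exact code in the patch:

```mlir
// Before distribution: a subgroup-level load of a 4x8 tile.
//   %td  = xegpu.create_nd_tdesc %src[0, 0]
//        : memref<4x8xf32> -> !xegpu.tensor_desc<4x8xf32>
//   %val = xegpu.load_nd %td
//        : !xegpu.tensor_desc<4x8xf32> -> vector<4x8xf32>
//
// After distribution across 8 lanes: the vector type is distributed
// (each work item holds a 4x1 slice), while the tensor descriptor
// remains uniform and is not distributed.
//   %r:2 = gpu.warp_execute_on_lane_0(%laneid)[8]
//        -> (!xegpu.tensor_desc<4x8xf32>, vector<4x1xf32>) { ... }
//   xegpu.store_nd %r#1, %r#0
//        : vector<4x1xf32>, !xegpu.tensor_desc<4x8xf32>
```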

@charithaintc charithaintc changed the title [MLIR][XeGPU] Xegpu distribution patterns for load_nd, store_nd, and create_nd_tdesc. [MLIR][XeGPU][Draft] Xegpu distribution patterns for load_nd, store_nd, and create_nd_tdesc. Dec 12, 2024

github-actions bot commented Dec 12, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@charithaintc charithaintc changed the title [MLIR][XeGPU][Draft] Xegpu distribution patterns for load_nd, store_nd, and create_nd_tdesc. [MLIR][XeGPU][Draft] XeGPU distribution patterns for load_nd, store_nd, and create_nd_tdesc. Dec 13, 2024
@charithaintc charithaintc changed the title [MLIR][XeGPU][Draft] XeGPU distribution patterns for load_nd, store_nd, and create_nd_tdesc. [mlir][xegpu] XeGPU distribution patterns for load_nd, store_nd, and create_nd_tdesc. Feb 4, 2025
@charithaintc charithaintc marked this pull request as ready for review February 4, 2025 20:44
@charithaintc
Contributor Author

@kurapov-peter @chencha3 @adam-smnk @Jianhui-Li Could you please review this PR.

Contributor

@kurapov-peter kurapov-peter left a comment

Thanks for reviving this!

/// %view = memref.subview %r#0[0, %laneid] [4, 1] [1, 1]
/// : memref<4x8xf32> to memref<4x1xf32>
/// %td = xegpu.create_nd_tdesc %view[0, 0]: memref<4x1xf32>
/// -> !xegpu.tensor_desc<4x1xf32>
Contributor

The comments need to be changed as well, as we don't need memref.subview.

Contributor Author

fixed now.

using namespace mlir;

namespace {
bool divisible(APInt lhs, APInt rhs) { return !lhs.urem(rhs); }
Contributor

@Jianhui-Li Jianhui-Li Feb 5, 2025

How about changing the file name to XeGPUSubgroupDistribute.cpp, to be more explicit? Since we also have a notion of "workgroup distribute".

Contributor Author

Everything under our control is changed to "Subgroup", along with class names, pass names, and test cases.

@@ -0,0 +1,79 @@
// RUN: mlir-opt -test-xegpu-distribute -split-input-file %s | FileCheck %s
Contributor

@Jianhui-Li Jianhui-Li Feb 5, 2025

How about changing the file name to xegpu-subgroup-distribute.cpp, to be more explicit? Since we also have a notion of "workgroup distribute".

Contributor Author

Renamed to subgroup-distribute.mlir. Since we are already in the XeGPU directory, I think the xegpu prefix is redundant here. I don't see other dialects using the same naming convention. We should also rename some existing test cases, like xegpu-fold-alias-ops.mlir to just fold-alias-ops.mlir.

/// !xegpu.tensor_desc<4x8xf32>, vector<4x8xf32>
/// vector.yield %arg0, %arg1
/// }
/// xegpu.store_nd %r#0, %r#1: vector<4x1xf32>,
Contributor

this should be load_nd?

Contributor Author

Yes, fixed it. Sorry I missed it.

}

LogicalResult
WarpOpTensorDescOp::matchAndRewrite(gpu::WarpExecuteOnLane0Op warpOp,
Contributor

Better to stick to the "subgroup" prefix since XeGPU uses "subgroup" terminology, which is the counterpart of warp.

Contributor Author

fixed.

@charithaintc
Contributor Author

@adam-smnk Could you please help with reviewing and merging this?

Comment on lines +19 to +20
#define DEBUG_TYPE "xegpu-distribute"
#define DBGS() (llvm::dbgs() << "[" DEBUG_TYPE "]: ")
Contributor

nit: looks unused so could be removed

auto layout = sgMap.getWiLayout();
auto shape = originalT.getShape();
for (const auto [l, o] : llvm::zip_equal(layout, shape)) {
if (!divisible(APInt(64, o), APInt(64, l)))
Contributor

Why do we need to go through APInt for this?

/// still contain the original op that will not be used by the yield op (and
/// should be cleaned up later with dce). The yield op will bypass the
/// create_nd_tdesc's arguments. Tensor descriptor is not distributed because it
/// is a uniform value accorss all work items within the subgroup.
Contributor

Suggested change
/// is a uniform value accorss all work items within the subgroup.
/// is a uniform value across all work items within the subgroup.

llvm::SmallVector<int64_t, 2> distributedShape;
auto layout = sgMap.getWiLayout();
auto shape = originalT.getShape();
for (const auto [l, o] : llvm::zip_equal(layout, shape)) {
Contributor

nit: could you use more descriptive variable names?

Contributor

Could you also add a few invalid test cases? Like 1D with sg_map, no map, etc.

MLIRXeGPUTransforms
MLIRXeGPUDialect
MLIRSupport
)
Contributor

nit: missing newline

"Failed to distribute the type");
VectorType newVectorType = distributedTypeOrFailure.value();

auto distributedDescTypeOrFailure = getDistributedTensorDescType(
Contributor

Do we need this at all? I think the TensorDesc can never be scattered for nd ops.

return rewriter.notifyMatchFailure(
descOp, "the tensor descriptor lacks sg_map attribute");

auto distributedDescTypeOrFailure = getDistributedTensorDescType(
Contributor

Could you add a test case for this?

@@ -0,0 +1,79 @@
// RUN: mlir-opt -test-xegpu-subgroup-distribute -split-input-file %s | FileCheck %s

#sg_map_16 = #xegpu.sg_map<wi_layout = [1, 16], wi_data = [1, 1]>
Contributor

Just a sanity check: does the distribution work fine with wi_data different from 1?

FailureOr<xegpu::TensorDescType>
getDistributedTensorDescType(xegpu::TensorDescType originalT,
xegpu::SGMapAttr sgMap,
xegpu::MemorySpace memSpace) {
Contributor

I take it that memSpace is there to propagate it to the newly created type. If so, it's unused at the moment.
Anyway, does it need to be a separate argument at all, or could it be taken directly from originalT?

@charithaintc
Contributor Author

@adam-smnk Thanks very much for the review. It looks like the PR needs more changes to support corner cases like wi_data = [1, 2]. I am working on it now.
