
[mlir] cleanup of structured.tile* transform ops #67320


Merged 1 commit into llvm:main on Sep 26, 2023
Conversation

@ftynse (Member) commented Sep 25, 2023

Rename and restructure tiling-related transform ops from the structured extension to be more homogeneous. In particular, all ops now follow a consistent naming scheme:

  • `transform.structured.tile_using_for`;
  • `transform.structured.tile_using_forall`;
  • `transform.structured.tile_reduction_using_for`;
  • `transform.structured.tile_reduction_using_forall`.

This drops the "_op" naming artifact from `tile_to_forall_op` that shouldn't have been included in the first place, consistently names the control-flow op produced for loops (the former `tile_reduction_using_scf` was ambiguous because `scf.forall` also belongs to `scf`), and opts for the `using` connector to avoid ambiguity.

The loops produced by tiling are now systematically placed as trailing results of the transform op. While this required changing 3 out of 4 ops (all except `tile_using_for`), it is the only choice that makes sense when producing multiple `scf.for` ops associated with a variadic number of handles. It is also most consistent with other transform ops from the structured extension, in particular the fusion ops, which produce the structured op as the leading result and the loop as the trailing result.
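As a sketch of the resulting convention (handle names and tile sizes here are illustrative, not taken from the patch), the tiled op is now the leading result and the loop handle(s) trail:

```mlir
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  // scf.forall tiling: tiled-op handle first, single forall handle trailing.
  %tiled, %forall = transform.structured.tile_using_forall %arg0 tile_sizes [4, 32]
      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
  // scf.for tiling: one trailing loop handle per non-zero tile size.
  %tiled2, %loop0, %loop1 = transform.structured.tile_using_for %tiled [4, 32]
      : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
  transform.yield
}
```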

@llvmbot (Member) commented Sep 25, 2023

@llvm/pr-subscribers-mlir-vector
@llvm/pr-subscribers-mlir-tensor
@llvm/pr-subscribers-mlir-linalg
@llvm/pr-subscribers-mlir-affine
@llvm/pr-subscribers-mlir-scf
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir-llvm

Changes



Patch is 142.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67320.diff

52 Files Affected:

  • (modified) mlir/docs/Tutorials/transform/Ch1.md (+11-11)
  • (modified) mlir/docs/Tutorials/transform/Ch2.md (+3-3)
  • (modified) mlir/docs/Tutorials/transform/ChH.md (+9-9)
  • (modified) mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td (+30-30)
  • (modified) mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp (+64-65)
  • (modified) mlir/python/mlir/dialects/_structured_transform_ops_ext.py (+7-5)
  • (modified) mlir/test/Dialect/Affine/loop-tiling-parametric.mlir (+8-8)
  • (modified) mlir/test/Dialect/Affine/loop-tiling.mlir (+2-2)
  • (modified) mlir/test/Dialect/GPU/transform-gpu-failing.mlir (+1-1)
  • (modified) mlir/test/Dialect/LLVM/transform-e2e.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/generalize-tensor-pack-tile.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/generalize-tensor-unpack-tile.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/matmul-shared-memory-padding.mlir (+4-4)
  • (modified) mlir/test/Dialect/Linalg/multisize-tiling-full.mlir (+8-8)
  • (modified) mlir/test/Dialect/Linalg/promote.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/promotion_options.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/tile-conv.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/tile-indexed.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/tile-softmax.mlir (+4-4)
  • (modified) mlir/test/Dialect/Linalg/tile-tensors.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/tile-to-forall.mlir (+11-11)
  • (modified) mlir/test/Dialect/Linalg/transform-op-compose-masked-vectorize-and-cleanups.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/transform-op-fuse-into-containing.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-fuse.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-hoist-pad-build-packing-loop-nest.mlir (+5-5)
  • (modified) mlir/test/Dialect/Linalg/transform-op-hoist-pad.mlir (+5-5)
  • (modified) mlir/test/Dialect/Linalg/transform-op-scalarize.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-tile.mlir (+8-8)
  • (modified) mlir/test/Dialect/Linalg/transform-ops.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-patterns.mlir (+10-10)
  • (modified) mlir/test/Dialect/Linalg/transform-tile-and-fuse.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/transform-tile-reduction.mlir (+10-10)
  • (modified) mlir/test/Dialect/SCF/transform-loop-fuse-sibling.mlir (+8-8)
  • (modified) mlir/test/Dialect/SCF/transform-op-take-assumed-branch.mlir (+1-1)
  • (modified) mlir/test/Dialect/Tensor/tiling.mlir (+17-17)
  • (modified) mlir/test/Dialect/Transform/ops.mlir (+4-4)
  • (modified) mlir/test/Dialect/Transform/selective-targeting.mlir (+1-1)
  • (modified) mlir/test/Dialect/Vector/transform-vector.mlir (+1-1)
  • (modified) mlir/test/Examples/transform/Ch1/invalidation-1.mlir (+2-2)
  • (modified) mlir/test/Examples/transform/Ch1/invalidation-2.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch1/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch2/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch3/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/ChH/full.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir (+1-1)
  • (modified) mlir/test/python/dialects/transform_structured_ext.py (+27-25)
diff --git a/mlir/docs/Tutorials/transform/Ch1.md b/mlir/docs/Tutorials/transform/Ch1.md
index e3ec5e769c48db1..95b37eb6ca41304 100644
--- a/mlir/docs/Tutorials/transform/Ch1.md
+++ b/mlir/docs/Tutorials/transform/Ch1.md
@@ -119,13 +119,13 @@ transform.sequence failures(propagate) {
      %arg1: !transform.op<"linalg.matmul">,
      %arg2: !transform.op<"linalg.elemwise_binary">):
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
     : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
   transform.yield
 }
 ```
 
-The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_to_forall_op-transformtiletoforallop):
+The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtiletoforallop):
 
 *   A handle to the `scf.forall` “multi-for” loop around tensors.
 *   A handle to `linalg.generic` operating on the subset of the original data.
@@ -176,7 +176,7 @@ transform.sequence failures(propagate) {
      %arg1: !transform.op<"linalg.matmul">,
      %arg2: !transform.op<"linalg.elemwise_binary">):
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
       : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
   // This is trying to use an invalidated handle leading to undefined behavior.
@@ -203,7 +203,7 @@ matmul.mlir:26:9: note: handle to invalidated ops
   %mm = transform.cast %matmul : !transform.op<"linalg.matmul"> to !transform.any_op
         ^
 matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
-  %loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
 One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead? 
@@ -219,7 +219,7 @@ transform.sequence failures(propagate) {
       to !transform.any_op
 
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
     : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
   // Consuming an operand invalidates the consumed handle and any other handle that is
@@ -240,7 +240,7 @@ matmul.mlir:21:29: note: handle to invalidated ops
 ^bb0(%root: !transform.any_op, %matmul: !transform.op<"linalg.matmul">, %elemwise: !transform.op<"linalg.elemwise_binary">):
                             ^
 matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
-  %loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
 ## Chaining Transformations with Handles
@@ -262,7 +262,7 @@ transform.sequence failures(propagate) {
   // The actual tiling transformation takes tile sizes as attributes. It
   // produces a handle to the loop generated during tiling.
   %loop, %tiled_max =
-      transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+      transform.structured.tile_using_forall %max tile_sizes [8, 32]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -304,7 +304,7 @@ transform.sequence failures(propagate) {
 
   // The actual tiling transformation takes tile sizes as attributes. It
   // produces a handle to the loop generated during tiling.
-  %loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -328,7 +328,7 @@ transform.sequence failures(propagate) {
   // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
   // of which having the same kind.
   %loop_2, %tiled_2 =
-      transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
+      transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   %matmul_fused_2, %loop_3 =
       transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
@@ -339,7 +339,7 @@ transform.sequence failures(propagate) {
   // such as loops, use tiling to size 1 to materialize the outer loop that is
   // going to be outlined.
   %outline_target, %_ =
-      transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+      transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   transform.structured.fuse_into_containing_op %matmul_fused_2
       into %outline_target
@@ -361,7 +361,7 @@ test/Examples/transform/Ch1/invalidation-2.mlir:109:3: error: op uses a handle i
   transform.test_print_remark_at_operand %outline_target, "outlined loop" : !transform.any_op
   ^
 test/Examples/transform/Ch1/invalidation-2.mlir:102:25: note: handle to invalidated ops
-  %outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+  %outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
                         ^
 test/Examples/transform/Ch1/invalidation-2.mlir:106:18: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
   %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
diff --git a/mlir/docs/Tutorials/transform/Ch2.md b/mlir/docs/Tutorials/transform/Ch2.md
index 310d2ae923f093c..8d5076e3ef40477 100644
--- a/mlir/docs/Tutorials/transform/Ch2.md
+++ b/mlir/docs/Tutorials/transform/Ch2.md
@@ -292,7 +292,7 @@ transform.sequence failures(propagate) {
 
   // The actual tiling transformation takes tile sizes as attributes. It produces a
   // handle to the loop generated during tiling.
-  %loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -311,7 +311,7 @@ transform.sequence failures(propagate) {
   // "max" operation. This illustrates the precise targeting with the transform
   // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
   // of which having the same kind.
-  %loop_2, %tiled_2 = transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
+  %loop_2, %tiled_2 = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   %matmul_fused_2 = transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
       : (!transform.any_op, !transform.any_op) -> !transform.any_op
@@ -319,7 +319,7 @@ transform.sequence failures(propagate) {
   // Since outlining is currently only implemented for region-holding operations
   // such as loops, use tiling to size 1 to materialize the outer loop that is
   // going to be outlined.
-  %outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+  %outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   transform.structured.fuse_into_containing_op %matmul_fused_2 into %outline_target
       : (!transform.any_op, !transform.any_op) -> !transform.any_op
diff --git a/mlir/docs/Tutorials/transform/ChH.md b/mlir/docs/Tutorials/transform/ChH.md
index ed41d225f7276d1..0696853dda1cf75 100644
--- a/mlir/docs/Tutorials/transform/ChH.md
+++ b/mlir/docs/Tutorials/transform/ChH.md
@@ -218,7 +218,7 @@ Linalg as described below.
     the inner loop having at most the given number of iterations. This can be
     understood as loop _strip-mining_ or a degenerate case of tiling a single
     dimension using any of `linalg.tile_` transform ops. We will be using
-    `transform.structured.tile_to_forall_op` as this kind of loop is best
+    `transform.structured.tile_using_forall` as this kind of loop is best
     supported by bufferization and can also be turned into a parallel loop later
     on. Unlike Halide, this doesn’t add new dimensions to the original
     operation, but rather creates a loop around it and rewrites the operation
@@ -275,9 +275,9 @@ The remaining dimensions can be materialized as loops in one transformation.
 
 ```mlir
     //                                                             [n  y  x  c]
-    %co, %relu2 = transform.structured.tile_to_forall_op %relu
+    %co, %relu2 = transform.structured.tile_using_forall %relu
                                                         tile_sizes [0, 0, 0, 64]
-    %n_y_xo, %relu3 = transform.structured.tile_to_forall_op %relu2
+    %n_y_xo, %relu3 = transform.structured.tile_using_forall %relu2
                                                         tile_sizes [1, 1, 5, 0]
 ```
 
@@ -355,7 +355,7 @@ more than one dimension at the moment of writing.)
 
 ```mlir
 %rz_ry_rx, %red_fill, %conv4, %comb
-  = transform.structured.tile_reduction_using_scf %conv3
+  = transform.structured.tile_reduction_using_for %conv3
 //               n  y  x  c  rz ry rx
   by tile_sizes=[0, 0, 0, 0, 1, 1, 1]
 ```
@@ -386,10 +386,10 @@ dimension:
 
 ```mlir
 //                                                                  n  y  xi ci
-%1, %c5 = transform.structured.tile_to_forall_op %conv4 tile_sizes [0, 0, 1, 16]
-%2, %b4 = transform.structured.tile_to_forall_op %bias3 tile_sizes [0, 0, 1, 16]
-%3, %r4 = transform.structured.tile_to_forall_op %relu3 tile_sizes [0, 0, 1, 16]
-%4, %c2 = transform.structured.tile_to_forall_op %comb  tile_sizes [0, 0, 1, 16]
+%1, %c5 = transform.structured.tile_using_forall %conv4 tile_sizes [0, 0, 1, 16]
+%2, %b4 = transform.structured.tile_using_forall %bias3 tile_sizes [0, 0, 1, 16]
+%3, %r4 = transform.structured.tile_using_forall %relu3 tile_sizes [0, 0, 1, 16]
+%4, %c2 = transform.structured.tile_using_forall %comb  tile_sizes [0, 0, 1, 16]
 ```
 
 Note that the combiner operation produced by reduction tiling is also tiled here.
@@ -638,7 +638,7 @@ bufferization invalidates all loop handles including to loops that we are
 willing to unroll. This hurdle can be overcome by matching the payload IR
 operations after bufferization to produce new handles. We will first change the
 kind of loops produced in the schedule from `scf.for` to `scf.forall` to have
-less operations to match by using `transform.structured.tile_to_forall_op`
+less operations to match by using `transform.structured.tile_using_forall`
 instead of `transform.structured.tile` when tiling with sizes `[0, 0, 1, 16]`.
 Then we can match all `scf.forall` operations in the payload IR and transform
 them into single-iterator `scf.for` loops _after bufferization_.
diff --git a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
index 0fca130a58e3940..1ff88d036bc036c 100644
--- a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
@@ -618,10 +618,10 @@ def MultiTileSizesOp : Op<Transform_Dialect, "structured.multitile_sizes",
                          !transform.param<i64>, !transform.param<i64>
     %low, %high = structured.split %target after %split { dimension = 1 }
                 : !transform.any_op, !transform.param<i64>
-    %tiled_low, %loop1 = structured.tile %low [0, %sz1]
+    %tiled_low, %loop1 = structured.tile_using_for %low [0, %sz1]
                        : (!transform.any_op, !transform.param<i64>)
                       -> (!transform.any_op, !transform.any_op)
-    %tiled_high, %loop2 = structured.tile %high [0, %sz2]
+    %tiled_high, %loop2 = structured.tile_using_for %high [0, %sz2]
                         : (!transform.any_op, !transform.param<i64>)
                        -> (!transform.any_op, !transform.any_op)
     %common = merge_handles %tiled_low, %tiled_high : !transform.any_op
@@ -1514,10 +1514,10 @@ def SplitReductionOp : Op<Transform_Dialect, "structured.split_reduction",
 }
 
 //===----------------------------------------------------------------------===//
-// TileReductionUsingScfOp
+// TileReductionUsingForOp
 //===----------------------------------------------------------------------===//
 
-def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_using_scf",
+def TileReductionUsingForOp : Op<Transform_Dialect, "structured.tile_reduction_using_for",
        [FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
         TransformEachOpTrait, TransformOpInterface,
         ReportTrackingListenerFailuresOpTrait]> {
@@ -1536,11 +1536,11 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u
 
     #### Return modes
 
-    This 4 returned handles point to:
-      - the parent for op,
+    Returns 4 handles associated with (in order):
       - the fill op used to initialize the neutral element,
       - the parallel tiled op and
-      - the result-combining op.
+      - the result-combining op,
+      - the parent `for` op.
 
     #### Example:
 
@@ -1590,13 +1590,13 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u
     ```
   }];
 
-  // TODO: support mixed static-dynamic (see TileToForallOp).
+  // TODO: support mixed static-dynamic (see TileUsingForallOp).
   let arguments = (ins TransformHandleTypeInterface:$target,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes);
-  let results = (outs TransformHandleTypeInterface:$for_op,
-                      TransformHandleTypeInterface:$fill_op,
+  let results = (outs TransformHandleTypeInterface:$fill_op,
                       TransformHandleTypeInterface:$split_linalg_op,
-                      TransformHandleTypeInterface:$combining_linalg_op);
+                      TransformHandleTypeInterface:$combining_linalg_op,
+                      TransformHandleTypeInterface:$for_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
@@ -1644,11 +1644,11 @@ def TileReductionUsingForallOp :
 
     #### Return modes
 
-    This 4 returned handles point to:
-      - the parent forall op,
+    Returns 4 handles associated with (in order):
       - the fill op used to initialize the neutral element,
       - the parallel tiled op and
-      - the result-combining op.
+      - the result-combining op,
+      - the parent `forall` op.
 
     #### Example:
 
@@ -1694,15 +1694,15 @@ def TileReductionUsingForallOp :
     ```
   }];
 
-  // TODO: support mixed static-dynamic (see TileToForallOp).
+  // TODO: support mixed static-dynamic (see TileUsingForallOp).
   let arguments = (ins TransformHandleTypeInterface:$target,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$num_threads,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes,
                    OptionalAttr<DeviceMappingArrayAttr>:$mapping);
-  let results = (outs TransformHandleTypeInterface:$forall_op,
-                      TransformHandleTypeInterface:$fill_op,
+  let results = (outs TransformHandleTypeInterface:$fill_op,
                       TransformHandleTypeInterface:$split_linalg_op,
-                      TransformHandleTypeInterface:$combining_linalg_op);
+                      TransformHandleTypeInterface:$combining_linalg_op,
+                      TransformHandleTypeInterface:$forall_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
@@ -1732,10 +1732,10 @@ def TileReductionUsingForallOp :
 }
 
 //===----------------------------------------------------------------------===//
-// TileOp
+// TileUsingForOp
 //===----------------------------------------------------------------------===//
 
-def TileOp : Op<Transform_Dialect, "structured.tile",
+def TileUsingForOp : Op<Transform_Dialect, "structured.tile_using_for",
        [DeclareOpInterfaceMethods<TransformOpInterface>,
         DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
         ReportTrackingListenerFailuresOpTrait]> {
@@ -1820,11 +1820,11 @@ def TileOp : Op<Transform_Dialect, "structured.tile",
 }
 
 //===----------------------------------------------------------------------===//
-// TileToForallOp
+// TileUsingForallOp
 //===----------------------------------------------------------------------===//
 
-def TileToForallOp :
-    Op<Transform_Dialect, "structured.tile_to_forall_op",
+def TileUsingForallOp :
+    Op<Transform_Dialect, "structured.tile_using_forall",
       [AttrSizedOperandSegments,
        DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
        TransformOpInterface, ReportTrackingListenerFailuresOpTrait]> {
@@ -1834,9 +1834,9 @@ def TileToForallOp :
     Tiling is applied by either specifying `num_threads` or `tile_size`. If
     `num_threads` is specified, then the tile size for each dimension `i` is
     calculated dynamically via `ceilDiv(dimSize[i], num_threads[i])`.
-    `num_threads` and `tile_size` can be either static index attributes or SSA
-    values of PDL operation handle type (or a mix thereof). Operation handles
-    must be mapped to exactly one op that has exactly one result of index type.
+    `num_threads` and `tile_size` can be either static index attributes or 
+    operation handles (or a mix thereof). Operation handles must be mapped to
+    exactly one op that has exactly one result of index type.
 
     Static zero tile sizes indicate that the dimension is not tiled and can be
     thought of as tiling by the full size of data.
@@ -1872,7 +1872,7 @@ def TileToForallOp :
 
     ```
     %0 = pdl_match @match_matmul in %arg1
-    %3:2 = transform.structured.tile_to_forall_op %0 num_threads [10, 20]
+    %3:2 = transform.structured.tile_using_forall %0 num_threads [10, 20]
     ```
 
     #### Example using `tile_sizes`
@@ -1880,7 +1880,7 @@ def TileToForallOp :
     ```
     %0 = pdl_match @match_matmul in %arg1
     %sz = pdl_match @match_size_op in %arg1
-    %3:2 = transform.structured.tile_to_forall_op %0 tile_sizes [0, %sz, 20]
+    %3:2 = transform.structured.tile_using_forall %0 tile_sizes [0, %sz, 20]
     ```
   }];
 
@@ -1892,8 +1892,8 @@ def TileToForallOp :
                    DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_num_threads,
                    DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_tile_sizes,
                    OptionalAttr<DeviceMappingArrayAttr>:$mapping);
-  let results = (outs TransformHandleTypeInterface:$forall_op,
-                      TransformHandleTypeInterface:$tiled_op);
+  let results = (outs TransformHandleTypeInterface:$tiled_op,
+                      TransformHandleTypeInterface:$forall_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
diff --git a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
index 8bfd731fcc81213..95b810e81dea9be 100644
--- a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
+++ b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
@@ -2293,17 +2293,17 @@ DiagnosedSilenceableFailure transform::SplitReductionOp::applyToOne(
 }
 
 //===--------------------------------------------...
[truncated]

@nicolasvasilache (Contributor) left a comment:
Great, thanks!

@chelini (Contributor) left a comment:
Thank you.

@ftynse merged commit 96ff025 into llvm:main on Sep 26, 2023
@ftynse ftynse deleted the tile-cleanup branch September 26, 2023 07:14
banach-space added a commit to banach-space/llvm-project that referenced this pull request Jan 4, 2024
Replace (in tests and docs):

  %forall, %tiled = transform.structured.tile_using_forall

with:

  %tiled, %forall = transform.structured.tile_using_forall

Applies similar change to (in the TD tutorial):

  transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for llvm#67320 which updated the order of
the return handles for `tile_using_forall`.
banach-space added a commit that referenced this pull request Jan 4, 2024
Replace (in tests and docs):

    %forall, %tiled = transform.structured.tile_using_forall

with (updated order of return handles):

    %tiled, %forall = transform.structured.tile_using_forall

Similar change is applied to (in the TD tutorial):

    transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.