
[mlir] cleanup of structured.tile* transform ops #67320


Merged 1 commit into llvm:main on Sep 26, 2023
Conversation

@ftynse (Member) commented Sep 25, 2023

Rename and restructure tiling-related transform ops from the structured extension to be more homogeneous. In particular, all ops now follow a consistent naming scheme:

  • `transform.structured.tile_using_for`;
  • `transform.structured.tile_using_forall`;
  • `transform.structured.tile_reduction_using_for`;
  • `transform.structured.tile_reduction_using_forall`.

This drops the "_op" naming artifact from `tile_to_forall_op` that shouldn't have been included in the first place, consistently names the control-flow op produced for loops (the former `tile_reduction_using_scf` was ambiguous because `scf.forall` also belongs to `scf`), and opts for the `using` connector to avoid ambiguity.

The loops produced by tiling are now systematically placed as trailing results of the transform op. While this required changing 3 out of 4 ops (all except `tile_using_for`), it is the only choice that makes sense when producing multiple `scf.for` ops associated with a variadic number of handles. It is also most consistent with other transform ops from the structured extension, in particular the fusion ops, which produce the structured op as the leading result and the loop as the trailing result.
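As a sketch of the resulting convention (handle names and tile sizes here are illustrative, not taken from the patch), the tiled op is now the leading result and the loop handle(s) trail:

```mlir
transform.sequence failures(propagate) {
^bb0(%arg0: !transform.any_op):
  // scf.forall tiling: tiled-op handle first, single forall handle trailing.
  %tiled, %forall = transform.structured.tile_using_forall %arg0 tile_sizes [4, 32]
      : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
  // scf.for tiling: one trailing loop handle per non-zero tile size.
  %tiled2, %loop0, %loop1 = transform.structured.tile_using_for %tiled [4, 32]
      : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
  transform.yield
}
```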

@llvmbot (Member) commented Sep 25, 2023

@llvm/pr-subscribers-mlir-vector
@llvm/pr-subscribers-mlir-tensor
@llvm/pr-subscribers-mlir-linalg
@llvm/pr-subscribers-mlir-affine
@llvm/pr-subscribers-mlir-scf
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir-llvm

Changes



Patch is 142.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67320.diff

52 Files Affected:

  • (modified) mlir/docs/Tutorials/transform/Ch1.md (+11-11)
  • (modified) mlir/docs/Tutorials/transform/Ch2.md (+3-3)
  • (modified) mlir/docs/Tutorials/transform/ChH.md (+9-9)
  • (modified) mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td (+30-30)
  • (modified) mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp (+64-65)
  • (modified) mlir/python/mlir/dialects/_structured_transform_ops_ext.py (+7-5)
  • (modified) mlir/test/Dialect/Affine/loop-tiling-parametric.mlir (+8-8)
  • (modified) mlir/test/Dialect/Affine/loop-tiling.mlir (+2-2)
  • (modified) mlir/test/Dialect/GPU/transform-gpu-failing.mlir (+1-1)
  • (modified) mlir/test/Dialect/LLVM/transform-e2e.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/generalize-tensor-pack-tile.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/generalize-tensor-unpack-tile.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/matmul-shared-memory-padding.mlir (+4-4)
  • (modified) mlir/test/Dialect/Linalg/multisize-tiling-full.mlir (+8-8)
  • (modified) mlir/test/Dialect/Linalg/promote.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/promotion_options.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/tile-conv.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/tile-indexed.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/tile-softmax.mlir (+4-4)
  • (modified) mlir/test/Dialect/Linalg/tile-tensors.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/tile-to-forall.mlir (+11-11)
  • (modified) mlir/test/Dialect/Linalg/transform-op-compose-masked-vectorize-and-cleanups.mlir (+2-2)
  • (modified) mlir/test/Dialect/Linalg/transform-op-fuse-into-containing.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-fuse.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-hoist-pad-build-packing-loop-nest.mlir (+5-5)
  • (modified) mlir/test/Dialect/Linalg/transform-op-hoist-pad.mlir (+5-5)
  • (modified) mlir/test/Dialect/Linalg/transform-op-scalarize.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-op-tile.mlir (+8-8)
  • (modified) mlir/test/Dialect/Linalg/transform-ops.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/transform-patterns.mlir (+10-10)
  • (modified) mlir/test/Dialect/Linalg/transform-tile-and-fuse.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/transform-tile-reduction.mlir (+10-10)
  • (modified) mlir/test/Dialect/SCF/transform-loop-fuse-sibling.mlir (+8-8)
  • (modified) mlir/test/Dialect/SCF/transform-op-take-assumed-branch.mlir (+1-1)
  • (modified) mlir/test/Dialect/Tensor/tiling.mlir (+17-17)
  • (modified) mlir/test/Dialect/Transform/ops.mlir (+4-4)
  • (modified) mlir/test/Dialect/Transform/selective-targeting.mlir (+1-1)
  • (modified) mlir/test/Dialect/Vector/transform-vector.mlir (+1-1)
  • (modified) mlir/test/Examples/transform/Ch1/invalidation-1.mlir (+2-2)
  • (modified) mlir/test/Examples/transform/Ch1/invalidation-2.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch1/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch2/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/Ch3/sequence.mlir (+3-3)
  • (modified) mlir/test/Examples/transform/ChH/full.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir (+1-1)
  • (modified) mlir/test/python/dialects/transform_structured_ext.py (+27-25)
diff --git a/mlir/docs/Tutorials/transform/Ch1.md b/mlir/docs/Tutorials/transform/Ch1.md
index e3ec5e769c48db1..95b37eb6ca41304 100644
--- a/mlir/docs/Tutorials/transform/Ch1.md
+++ b/mlir/docs/Tutorials/transform/Ch1.md
@@ -119,13 +119,13 @@ transform.sequence failures(propagate) {
      %arg1: !transform.op<"linalg.matmul">,
      %arg2: !transform.op<"linalg.elemwise_binary">):
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
     : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
   transform.yield
 }
 ```
 
-The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_to_forall_op-transformtiletoforallop):
+The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtiletoforallop):
 
 *   A handle to the `scf.forall` “multi-for” loop around tensors.
 *   A handle to `linalg.generic` operating on the subset of the original data.
@@ -176,7 +176,7 @@ transform.sequence failures(propagate) {
      %arg1: !transform.op<"linalg.matmul">,
      %arg2: !transform.op<"linalg.elemwise_binary">):
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
       : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
   // This is trying to use an invalidated handle leading to undefined behavior.
@@ -203,7 +203,7 @@ matmul.mlir:26:9: note: handle to invalidated ops
   %mm = transform.cast %matmul : !transform.op<"linalg.matmul"> to !transform.any_op
         ^
 matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
-  %loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
 One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead? 
@@ -219,7 +219,7 @@ transform.sequence failures(propagate) {
       to !transform.any_op
 
   // The actual tiling transformation takes tile sizes as attributes.
-  %loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
     : (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
 
   // Consuming an operand invalidates the consumed handle and any other handle that is
@@ -240,7 +240,7 @@ matmul.mlir:21:29: note: handle to invalidated ops
 ^bb0(%root: !transform.any_op, %matmul: !transform.op<"linalg.matmul">, %elemwise: !transform.op<"linalg.elemwise_binary">):
                             ^
 matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
-  %loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
 ```
 
 ## Chaining Transformations with Handles
@@ -262,7 +262,7 @@ transform.sequence failures(propagate) {
   // The actual tiling transformation takes tile sizes as attributes. It
   // produces a handle to the loop generated during tiling.
   %loop, %tiled_max =
-      transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+      transform.structured.tile_using_forall %max tile_sizes [8, 32]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -304,7 +304,7 @@ transform.sequence failures(propagate) {
 
   // The actual tiling transformation takes tile sizes as attributes. It
   // produces a handle to the loop generated during tiling.
-  %loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -328,7 +328,7 @@ transform.sequence failures(propagate) {
   // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
   // of which having the same kind.
   %loop_2, %tiled_2 =
-      transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
+      transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   %matmul_fused_2, %loop_3 =
       transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
@@ -339,7 +339,7 @@ transform.sequence failures(propagate) {
   // such as loops, use tiling to size 1 to materialize the outer loop that is
   // going to be outlined.
   %outline_target, %_ =
-      transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+      transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
         : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   transform.structured.fuse_into_containing_op %matmul_fused_2
       into %outline_target
@@ -361,7 +361,7 @@ test/Examples/transform/Ch1/invalidation-2.mlir:109:3: error: op uses a handle i
   transform.test_print_remark_at_operand %outline_target, "outlined loop" : !transform.any_op
   ^
 test/Examples/transform/Ch1/invalidation-2.mlir:102:25: note: handle to invalidated ops
-  %outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+  %outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
                         ^
 test/Examples/transform/Ch1/invalidation-2.mlir:106:18: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
   %func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
diff --git a/mlir/docs/Tutorials/transform/Ch2.md b/mlir/docs/Tutorials/transform/Ch2.md
index 310d2ae923f093c..8d5076e3ef40477 100644
--- a/mlir/docs/Tutorials/transform/Ch2.md
+++ b/mlir/docs/Tutorials/transform/Ch2.md
@@ -292,7 +292,7 @@ transform.sequence failures(propagate) {
 
   // The actual tiling transformation takes tile sizes as attributes. It produces a
   // handle to the loop generated during tiling.
-  %loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
+  %loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
 
   // We can now fuse the other operations into the loop. Here, we fuse
@@ -311,7 +311,7 @@ transform.sequence failures(propagate) {
   // "max" operation. This illustrates the precise targeting with the transform
   // dialect. Otherwise, it is difficult to differentiate "add" and "max", both
   // of which having the same kind.
-  %loop_2, %tiled_2 = transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
+  %loop_2, %tiled_2 = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   %matmul_fused_2 = transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
       : (!transform.any_op, !transform.any_op) -> !transform.any_op
@@ -319,7 +319,7 @@ transform.sequence failures(propagate) {
   // Since outlining is currently only implemented for region-holding operations
   // such as loops, use tiling to size 1 to materialize the outer loop that is
   // going to be outlined.
-  %outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
+  %outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
       : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
   transform.structured.fuse_into_containing_op %matmul_fused_2 into %outline_target
       : (!transform.any_op, !transform.any_op) -> !transform.any_op
diff --git a/mlir/docs/Tutorials/transform/ChH.md b/mlir/docs/Tutorials/transform/ChH.md
index ed41d225f7276d1..0696853dda1cf75 100644
--- a/mlir/docs/Tutorials/transform/ChH.md
+++ b/mlir/docs/Tutorials/transform/ChH.md
@@ -218,7 +218,7 @@ Linalg as described below.
     the inner loop having at most the given number of iterations. This can be
     understood as loop _strip-mining_ or a degenerate case of tiling a single
     dimension using any of `linalg.tile_` transform ops. We will be using
-    `transform.structured.tile_to_forall_op` as this kind of loop is best
+    `transform.structured.tile_using_forall` as this kind of loop is best
     supported by bufferization and can also be turned into a parallel loop later
     on. Unlike Halide, this doesn’t add new dimensions to the original
     operation, but rather creates a loop around it and rewrites the operation
@@ -275,9 +275,9 @@ The remaining dimensions can be materialized as loops in one transformation.
 
 ```mlir
     //                                                             [n  y  x  c]
-    %co, %relu2 = transform.structured.tile_to_forall_op %relu
+    %co, %relu2 = transform.structured.tile_using_forall %relu
                                                         tile_sizes [0, 0, 0, 64]
-    %n_y_xo, %relu3 = transform.structured.tile_to_forall_op %relu2
+    %n_y_xo, %relu3 = transform.structured.tile_using_forall %relu2
                                                         tile_sizes [1, 1, 5, 0]
 ```
 
@@ -355,7 +355,7 @@ more than one dimension at the moment of writing.)
 
 ```mlir
 %rz_ry_rx, %red_fill, %conv4, %comb
-  = transform.structured.tile_reduction_using_scf %conv3
+  = transform.structured.tile_reduction_using_for %conv3
 //               n  y  x  c  rz ry rx
   by tile_sizes=[0, 0, 0, 0, 1, 1, 1]
 ```
@@ -386,10 +386,10 @@ dimension:
 
 ```mlir
 //                                                                  n  y  xi ci
-%1, %c5 = transform.structured.tile_to_forall_op %conv4 tile_sizes [0, 0, 1, 16]
-%2, %b4 = transform.structured.tile_to_forall_op %bias3 tile_sizes [0, 0, 1, 16]
-%3, %r4 = transform.structured.tile_to_forall_op %relu3 tile_sizes [0, 0, 1, 16]
-%4, %c2 = transform.structured.tile_to_forall_op %comb  tile_sizes [0, 0, 1, 16]
+%1, %c5 = transform.structured.tile_using_forall %conv4 tile_sizes [0, 0, 1, 16]
+%2, %b4 = transform.structured.tile_using_forall %bias3 tile_sizes [0, 0, 1, 16]
+%3, %r4 = transform.structured.tile_using_forall %relu3 tile_sizes [0, 0, 1, 16]
+%4, %c2 = transform.structured.tile_using_forall %comb  tile_sizes [0, 0, 1, 16]
 ```
 
 Note that the combiner operation produced by reduction tiling is also tiled here.
@@ -638,7 +638,7 @@ bufferization invalidates all loop handles including to loops that we are
 willing to unroll. This hurdle can be overcome by matching the payload IR
 operations after bufferization to produce new handles. We will first change the
 kind of loops produced in the schedule from `scf.for` to `scf.forall` to have
-less operations to match by using `transform.structured.tile_to_forall_op`
+less operations to match by using `transform.structured.tile_using_forall`
 instead of `transform.structured.tile` when tiling with sizes `[0, 0, 1, 16]`.
 Then we can match all `scf.forall` operations in the payload IR and transform
 them into single-iterator `scf.for` loops _after bufferization_.
diff --git a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
index 0fca130a58e3940..1ff88d036bc036c 100644
--- a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
@@ -618,10 +618,10 @@ def MultiTileSizesOp : Op<Transform_Dialect, "structured.multitile_sizes",
                          !transform.param<i64>, !transform.param<i64>
     %low, %high = structured.split %target after %split { dimension = 1 }
                 : !transform.any_op, !transform.param<i64>
-    %tiled_low, %loop1 = structured.tile %low [0, %sz1]
+    %tiled_low, %loop1 = structured.tile_using_for %low [0, %sz1]
                        : (!transform.any_op, !transform.param<i64>)
                       -> (!transform.any_op, !transform.any_op)
-    %tiled_high, %loop2 = structured.tile %high [0, %sz2]
+    %tiled_high, %loop2 = structured.tile_using_for %high [0, %sz2]
                         : (!transform.any_op, !transform.param<i64>)
                        -> (!transform.any_op, !transform.any_op)
     %common = merge_handles %tiled_low, %tiled_high : !transform.any_op
@@ -1514,10 +1514,10 @@ def SplitReductionOp : Op<Transform_Dialect, "structured.split_reduction",
 }
 
 //===----------------------------------------------------------------------===//
-// TileReductionUsingScfOp
+// TileReductionUsingForOp
 //===----------------------------------------------------------------------===//
 
-def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_using_scf",
+def TileReductionUsingForOp : Op<Transform_Dialect, "structured.tile_reduction_using_for",
        [FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
         TransformEachOpTrait, TransformOpInterface,
         ReportTrackingListenerFailuresOpTrait]> {
@@ -1536,11 +1536,11 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u
 
     #### Return modes
 
-    This 4 returned handles point to:
-      - the parent for op,
+    Returns 4 handles associated with (in order):
       - the fill op used to initialize the neutral element,
       - the parallel tiled op and
-      - the result-combining op.
+      - the result-combining op,
+      - the parent `for` op.
 
     #### Example:
 
@@ -1590,13 +1590,13 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u
     ```
   }];
 
-  // TODO: support mixed static-dynamic (see TileToForallOp).
+  // TODO: support mixed static-dynamic (see TileUsingForallOp).
   let arguments = (ins TransformHandleTypeInterface:$target,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes);
-  let results = (outs TransformHandleTypeInterface:$for_op,
-                      TransformHandleTypeInterface:$fill_op,
+  let results = (outs TransformHandleTypeInterface:$fill_op,
                       TransformHandleTypeInterface:$split_linalg_op,
-                      TransformHandleTypeInterface:$combining_linalg_op);
+                      TransformHandleTypeInterface:$combining_linalg_op,
+                      TransformHandleTypeInterface:$for_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
@@ -1644,11 +1644,11 @@ def TileReductionUsingForallOp :
 
     #### Return modes
 
-    This 4 returned handles point to:
-      - the parent forall op,
+    Returns 4 handles associated with (in order):
       - the fill op used to initialize the neutral element,
       - the parallel tiled op and
-      - the result-combining op.
+      - the result-combining op,
+      - the parent `forall` op.
 
     #### Example:
 
@@ -1694,15 +1694,15 @@ def TileReductionUsingForallOp :
     ```
   }];
 
-  // TODO: support mixed static-dynamic (see TileToForallOp).
+  // TODO: support mixed static-dynamic (see TileUsingForallOp).
   let arguments = (ins TransformHandleTypeInterface:$target,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$num_threads,
                    DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes,
                    OptionalAttr<DeviceMappingArrayAttr>:$mapping);
-  let results = (outs TransformHandleTypeInterface:$forall_op,
-                      TransformHandleTypeInterface:$fill_op,
+  let results = (outs TransformHandleTypeInterface:$fill_op,
                       TransformHandleTypeInterface:$split_linalg_op,
-                      TransformHandleTypeInterface:$combining_linalg_op);
+                      TransformHandleTypeInterface:$combining_linalg_op,
+                      TransformHandleTypeInterface:$forall_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
@@ -1732,10 +1732,10 @@ def TileReductionUsingForallOp :
 }
 
 //===----------------------------------------------------------------------===//
-// TileOp
+// TileUsingForOp
 //===----------------------------------------------------------------------===//
 
-def TileOp : Op<Transform_Dialect, "structured.tile",
+def TileUsingForOp : Op<Transform_Dialect, "structured.tile_using_for",
        [DeclareOpInterfaceMethods<TransformOpInterface>,
         DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
         ReportTrackingListenerFailuresOpTrait]> {
@@ -1820,11 +1820,11 @@ def TileOp : Op<Transform_Dialect, "structured.tile",
 }
 
 //===----------------------------------------------------------------------===//
-// TileToForallOp
+// TileUsingForallOp
 //===----------------------------------------------------------------------===//
 
-def TileToForallOp :
-    Op<Transform_Dialect, "structured.tile_to_forall_op",
+def TileUsingForallOp :
+    Op<Transform_Dialect, "structured.tile_using_forall",
       [AttrSizedOperandSegments,
        DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
        TransformOpInterface, ReportTrackingListenerFailuresOpTrait]> {
@@ -1834,9 +1834,9 @@ def TileToForallOp :
     Tiling is applied by either specifying `num_threads` or `tile_size`. If
     `num_threads` is specified, then the tile size for each dimension `i` is
     calculated dynamically via `ceilDiv(dimSize[i], num_threads[i])`.
-    `num_threads` and `tile_size` can be either static index attributes or SSA
-    values of PDL operation handle type (or a mix thereof). Operation handles
-    must be mapped to exactly one op that has exactly one result of index type.
+    `num_threads` and `tile_size` can be either static index attributes or 
+    operation handles (or a mix thereof). Operation handles must be mapped to
+    exactly one op that has exactly one result of index type.
 
     Static zero tile sizes indicate that the dimension is not tiled and can be
     thought of as tiling by the full size of data.
@@ -1872,7 +1872,7 @@ def TileToForallOp :
 
     ```
     %0 = pdl_match @match_matmul in %arg1
-    %3:2 = transform.structured.tile_to_forall_op %0 num_threads [10, 20]
+    %3:2 = transform.structured.tile_using_forall %0 num_threads [10, 20]
     ```
 
     #### Example using `tile_sizes`
@@ -1880,7 +1880,7 @@ def TileToForallOp :
     ```
     %0 = pdl_match @match_matmul in %arg1
     %sz = pdl_match @match_size_op in %arg1
-    %3:2 = transform.structured.tile_to_forall_op %0 tile_sizes [0, %sz, 20]
+    %3:2 = transform.structured.tile_using_forall %0 tile_sizes [0, %sz, 20]
     ```
   }];
 
@@ -1892,8 +1892,8 @@ def TileToForallOp :
                    DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_num_threads,
                    DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_tile_sizes,
                    OptionalAttr<DeviceMappingArrayAttr>:$mapping);
-  let results = (outs TransformHandleTypeInterface:$forall_op,
-                      TransformHandleTypeInterface:$tiled_op);
+  let results = (outs TransformHandleTypeInterface:$tiled_op,
+                      TransformHandleTypeInterface:$forall_op);
 
   let builders = [
     OpBuilder<(ins "Value":$target,
diff --git a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
index 8bfd731fcc81213..95b810e81dea9be 100644
--- a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
+++ b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
@@ -2293,17 +2293,17 @@ DiagnosedSilenceableFailure transform::SplitReductionOp::applyToOne(
 }
 
 //===--------------------------------------------...
[truncated]

@nicolasvasilache (Contributor) left a comment:
Great, thanks!

@chelini (Contributor) left a comment:
Thank you.

@ftynse merged commit 96ff025 into llvm:main on Sep 26, 2023
@ftynse ftynse deleted the tile-cleanup branch September 26, 2023 07:14
banach-space added a commit to banach-space/llvm-project that referenced this pull request Jan 4, 2024
Replace (in tests and docs):

  %forall, %tiled = transform.structured.tile_using_forall

with:

  %tiled, %forall = transform.structured.tile_using_forall

Applies similar change to (in the TD tutorial):

  transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for llvm#67320 which updated the order of
the return handles for `tile_using_forall`.
banach-space added a commit that referenced this pull request Jan 4, 2024
Replace (in tests and docs):

    %forall, %tiled = transform.structured.tile_using_forall

with (updated order of return handles):

    %tiled, %forall = transform.structured.tile_using_forall

Similar change is applied to (in the TD tutorial):

    transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.