Skip to content

[mlir] cleanup of structured.tile* transform ops #67320

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 11 additions & 11 deletions mlir/docs/Tutorials/transform/Ch1.md
Original file line number Diff line number Diff line change
Expand Up @@ -119,13 +119,13 @@ transform.sequence failures(propagate) {
%arg1: !transform.op<"linalg.matmul">,
%arg2: !transform.op<"linalg.elemwise_binary">):
// The actual tiling transformation takes tile sizes as attributes.
%loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
%loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
: (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)
transform.yield
}
```

The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_to_forall_op-transformtiletoforallop):
The transformation returns two handles, as indicated in its [documentation](https://mlir.llvm.org/docs/Dialects/Transform/#transformstructuredtile_using_forall-transformtiletoforallop):

* A handle to the `scf.forall` “multi-for” loop around tensors.
* A handle to `linalg.generic` operating on the subset of the original data.
Expand Down Expand Up @@ -176,7 +176,7 @@ transform.sequence failures(propagate) {
%arg1: !transform.op<"linalg.matmul">,
%arg2: !transform.op<"linalg.elemwise_binary">):
// The actual tiling transformation takes tile sizes as attributes.
%loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
%loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
: (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)

// This is trying to use an invalidated handle leading to undefined behavior.
Expand All @@ -203,7 +203,7 @@ matmul.mlir:26:9: note: handle to invalidated ops
%mm = transform.cast %matmul : !transform.op<"linalg.matmul"> to !transform.any_op
^
matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
%loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
%loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
```

One may observe that some operations such as `transform.cast` do not consume the operand (because they don’t erase the corresponding operation). So what would happen if we tried to use that operand instead?
Expand All @@ -219,7 +219,7 @@ transform.sequence failures(propagate) {
to !transform.any_op

// The actual tiling transformation takes tile sizes as attributes.
%loop, %tiled = transform.structured.tile_to_forall_op %arg1 tile_sizes [4, 32]
%loop, %tiled = transform.structured.tile_using_forall %arg1 tile_sizes [4, 32]
: (!transform.op<"linalg.matmul">) -> (!transform.any_op, !transform.any_op)

// Consuming an operand invalidates the consumed handle and any other handle that is
Expand All @@ -240,7 +240,7 @@ matmul.mlir:21:29: note: handle to invalidated ops
^bb0(%root: !transform.any_op, %matmul: !transform.op<"linalg.matmul">, %elemwise: !transform.op<"linalg.elemwise_binary">):
^
matmul.mlir:27:19: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
%loop, %tiled = transform.structured.tile_to_forall_op %mm tile_sizes [4, 32]
%loop, %tiled = transform.structured.tile_using_forall %mm tile_sizes [4, 32]
```

## Chaining Transformations with Handles
Expand All @@ -262,7 +262,7 @@ transform.sequence failures(propagate) {
// The actual tiling transformation takes tile sizes as attributes. It
// produces a handle to the loop generated during tiling.
%loop, %tiled_max =
transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
transform.structured.tile_using_forall %max tile_sizes [8, 32]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)

// We can now fuse the other operations into the loop. Here, we fuse
Expand Down Expand Up @@ -304,7 +304,7 @@ transform.sequence failures(propagate) {

// The actual tiling transformation takes tile sizes as attributes. It
// produces a handle to the loop generated during tiling.
%loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
%loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)

// We can now fuse the other operations into the loop. Here, we fuse
Expand All @@ -328,7 +328,7 @@ transform.sequence failures(propagate) {
// dialect. Otherwise, it is difficult to differentiate "add" and "max", both
// of which having the same kind.
%loop_2, %tiled_2 =
transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)
%matmul_fused_2, %loop_3 =
transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
Expand All @@ -339,7 +339,7 @@ transform.sequence failures(propagate) {
// such as loops, use tiling to size 1 to materialize the outer loop that is
// going to be outlined.
%outline_target, %_ =
transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)
transform.structured.fuse_into_containing_op %matmul_fused_2
into %outline_target
Expand All @@ -361,7 +361,7 @@ test/Examples/transform/Ch1/invalidation-2.mlir:109:3: error: op uses a handle i
transform.test_print_remark_at_operand %outline_target, "outlined loop" : !transform.any_op
^
test/Examples/transform/Ch1/invalidation-2.mlir:102:25: note: handle to invalidated ops
%outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
%outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
^
test/Examples/transform/Ch1/invalidation-2.mlir:106:18: note: invalidated by this transform op that consumes its operand #0 and invalidates all handles to payload IR entities associated with this operand and entities nested in them
%func, %call = transform.loop.outline %outline_target {func_name = "outlined"}
Expand Down
6 changes: 3 additions & 3 deletions mlir/docs/Tutorials/transform/Ch2.md
Original file line number Diff line number Diff line change
Expand Up @@ -292,7 +292,7 @@ transform.sequence failures(propagate) {

// The actual tiling transformation takes tile sizes as attributes. It produces a
// handle to the loop generated during tiling.
%loop, %tiled = transform.structured.tile_to_forall_op %max tile_sizes [8, 32]
%loop, %tiled = transform.structured.tile_using_forall %max tile_sizes [8, 32]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)

// We can now fuse the other operations into the loop. Here, we fuse
Expand All @@ -311,15 +311,15 @@ transform.sequence failures(propagate) {
// "max" operation. This illustrates the precise targeting with the transform
// dialect. Otherwise, it is difficult to differentiate "add" and "max", both
// of which having the same kind.
%loop_2, %tiled_2 = transform.structured.tile_to_forall_op %add_fused tile_sizes [4, 4]
%loop_2, %tiled_2 = transform.structured.tile_using_forall %add_fused tile_sizes [4, 4]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)
%matmul_fused_2 = transform.structured.fuse_into_containing_op %matmul_fused into %loop_2
: (!transform.any_op, !transform.any_op) -> !transform.any_op

// Since outlining is currently only implemented for region-holding operations
// such as loops, use tiling to size 1 to materialize the outer loop that is
// going to be outlined.
%outline_target, %_ = transform.structured.tile_to_forall_op %tiled_2 tile_sizes [1]
%outline_target, %_ = transform.structured.tile_using_forall %tiled_2 tile_sizes [1]
: (!transform.any_op) -> (!transform.any_op, !transform.any_op)
transform.structured.fuse_into_containing_op %matmul_fused_2 into %outline_target
: (!transform.any_op, !transform.any_op) -> !transform.any_op
Expand Down
18 changes: 9 additions & 9 deletions mlir/docs/Tutorials/transform/ChH.md
Original file line number Diff line number Diff line change
Expand Up @@ -218,7 +218,7 @@ Linalg as described below.
the inner loop having at most the given number of iterations. This can be
understood as loop _strip-mining_ or a degenerate case of tiling a single
dimension using any of `linalg.tile_` transform ops. We will be using
`transform.structured.tile_to_forall_op` as this kind of loop is best
`transform.structured.tile_using_forall` as this kind of loop is best
supported by bufferization and can also be turned into a parallel loop later
on. Unlike Halide, this doesn’t add new dimensions to the original
operation, but rather creates a loop around it and rewrites the operation
Expand Down Expand Up @@ -275,9 +275,9 @@ The remaining dimensions can be materialized as loops in one transformation.

```mlir
// [n y x c]
%co, %relu2 = transform.structured.tile_to_forall_op %relu
%co, %relu2 = transform.structured.tile_using_forall %relu
tile_sizes [0, 0, 0, 64]
%n_y_xo, %relu3 = transform.structured.tile_to_forall_op %relu2
%n_y_xo, %relu3 = transform.structured.tile_using_forall %relu2
tile_sizes [1, 1, 5, 0]
```

Expand Down Expand Up @@ -355,7 +355,7 @@ more than one dimension at the moment of writing.)

```mlir
%rz_ry_rx, %red_fill, %conv4, %comb
= transform.structured.tile_reduction_using_scf %conv3
= transform.structured.tile_reduction_using_for %conv3
// n y x c rz ry rx
by tile_sizes=[0, 0, 0, 0, 1, 1, 1]
```
Expand Down Expand Up @@ -386,10 +386,10 @@ dimension:

```mlir
// n y xi ci
%1, %c5 = transform.structured.tile_to_forall_op %conv4 tile_sizes [0, 0, 1, 16]
%2, %b4 = transform.structured.tile_to_forall_op %bias3 tile_sizes [0, 0, 1, 16]
%3, %r4 = transform.structured.tile_to_forall_op %relu3 tile_sizes [0, 0, 1, 16]
%4, %c2 = transform.structured.tile_to_forall_op %comb tile_sizes [0, 0, 1, 16]
%1, %c5 = transform.structured.tile_using_forall %conv4 tile_sizes [0, 0, 1, 16]
%2, %b4 = transform.structured.tile_using_forall %bias3 tile_sizes [0, 0, 1, 16]
%3, %r4 = transform.structured.tile_using_forall %relu3 tile_sizes [0, 0, 1, 16]
%4, %c2 = transform.structured.tile_using_forall %comb tile_sizes [0, 0, 1, 16]
```

Note that the combiner operation produced by reduction tiling is also tiled here.
Expand Down Expand Up @@ -638,7 +638,7 @@ bufferization invalidates all loop handles including to loops that we are
willing to unroll. This hurdle can be overcome by matching the payload IR
operations after bufferization to produce new handles. We will first change the
kind of loops produced in the schedule from `scf.for` to `scf.forall` to have
less operations to match by using `transform.structured.tile_to_forall_op`
less operations to match by using `transform.structured.tile_using_forall`
instead of `transform.structured.tile` when tiling with sizes `[0, 0, 1, 16]`.
Then we can match all `scf.forall` operations in the payload IR and transform
them into single-iterator `scf.for` loops _after bufferization_.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -618,10 +618,10 @@ def MultiTileSizesOp : Op<Transform_Dialect, "structured.multitile_sizes",
!transform.param<i64>, !transform.param<i64>
%low, %high = structured.split %target after %split { dimension = 1 }
: !transform.any_op, !transform.param<i64>
%tiled_low, %loop1 = structured.tile %low [0, %sz1]
%tiled_low, %loop1 = structured.tile_using_for %low [0, %sz1]
: (!transform.any_op, !transform.param<i64>)
-> (!transform.any_op, !transform.any_op)
%tiled_high, %loop2 = structured.tile %high [0, %sz2]
%tiled_high, %loop2 = structured.tile_using_for %high [0, %sz2]
: (!transform.any_op, !transform.param<i64>)
-> (!transform.any_op, !transform.any_op)
%common = merge_handles %tiled_low, %tiled_high : !transform.any_op
Expand Down Expand Up @@ -1514,10 +1514,10 @@ def SplitReductionOp : Op<Transform_Dialect, "structured.split_reduction",
}

//===----------------------------------------------------------------------===//
// TileReductionUsingScfOp
// TileReductionUsingForOp
//===----------------------------------------------------------------------===//

def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_using_scf",
def TileReductionUsingForOp : Op<Transform_Dialect, "structured.tile_reduction_using_for",
[FunctionalStyleTransformOpTrait, MemoryEffectsOpInterface,
TransformEachOpTrait, TransformOpInterface,
ReportTrackingListenerFailuresOpTrait]> {
Expand All @@ -1536,11 +1536,11 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u

#### Return modes

This 4 returned handles point to:
- the parent for op,
Returns 4 handles associated with (in order):
- the fill op used to initialize the neutral element,
- the parallel tiled op and
- the result-combining op.
- the result-combining op,
- the parent `for` op.

#### Example:

Expand Down Expand Up @@ -1590,13 +1590,13 @@ def TileReductionUsingScfOp : Op<Transform_Dialect, "structured.tile_reduction_u
```
}];

// TODO: support mixed static-dynamic (see TileToForallOp).
// TODO: support mixed static-dynamic (see TileUsingForallOp).
let arguments = (ins TransformHandleTypeInterface:$target,
DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes);
let results = (outs TransformHandleTypeInterface:$for_op,
TransformHandleTypeInterface:$fill_op,
let results = (outs TransformHandleTypeInterface:$fill_op,
TransformHandleTypeInterface:$split_linalg_op,
TransformHandleTypeInterface:$combining_linalg_op);
TransformHandleTypeInterface:$combining_linalg_op,
TransformHandleTypeInterface:$for_op);

let builders = [
OpBuilder<(ins "Value":$target,
Expand Down Expand Up @@ -1644,11 +1644,11 @@ def TileReductionUsingForallOp :

#### Return modes

This 4 returned handles point to:
- the parent forall op,
Returns 4 handles associated with (in order):
- the fill op used to initialize the neutral element,
- the parallel tiled op and
- the result-combining op.
- the result-combining op,
- the parent `forall` op.

#### Example:

Expand Down Expand Up @@ -1694,15 +1694,15 @@ def TileReductionUsingForallOp :
```
}];

// TODO: support mixed static-dynamic (see TileToForallOp).
// TODO: support mixed static-dynamic (see TileUsingForallOp).
let arguments = (ins TransformHandleTypeInterface:$target,
DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$num_threads,
DefaultValuedAttr<DenseI64ArrayAttr, "{}">:$tile_sizes,
OptionalAttr<DeviceMappingArrayAttr>:$mapping);
let results = (outs TransformHandleTypeInterface:$forall_op,
TransformHandleTypeInterface:$fill_op,
let results = (outs TransformHandleTypeInterface:$fill_op,
TransformHandleTypeInterface:$split_linalg_op,
TransformHandleTypeInterface:$combining_linalg_op);
TransformHandleTypeInterface:$combining_linalg_op,
TransformHandleTypeInterface:$forall_op);

let builders = [
OpBuilder<(ins "Value":$target,
Expand Down Expand Up @@ -1732,10 +1732,10 @@ def TileReductionUsingForallOp :
}

//===----------------------------------------------------------------------===//
// TileOp
// TileUsingForOp
//===----------------------------------------------------------------------===//

def TileOp : Op<Transform_Dialect, "structured.tile",
def TileUsingForOp : Op<Transform_Dialect, "structured.tile_using_for",
[DeclareOpInterfaceMethods<TransformOpInterface>,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
ReportTrackingListenerFailuresOpTrait]> {
Expand Down Expand Up @@ -1820,11 +1820,11 @@ def TileOp : Op<Transform_Dialect, "structured.tile",
}

//===----------------------------------------------------------------------===//
// TileToForallOp
// TileUsingForallOp
//===----------------------------------------------------------------------===//

def TileToForallOp :
Op<Transform_Dialect, "structured.tile_to_forall_op",
def TileUsingForallOp :
Op<Transform_Dialect, "structured.tile_using_forall",
[AttrSizedOperandSegments,
DeclareOpInterfaceMethods<MemoryEffectsOpInterface>,
TransformOpInterface, ReportTrackingListenerFailuresOpTrait]> {
Expand All @@ -1834,9 +1834,9 @@ def TileToForallOp :
Tiling is applied by either specifying `num_threads` or `tile_size`. If
`num_threads` is specified, then the tile size for each dimension `i` is
calculated dynamically via `ceilDiv(dimSize[i], num_threads[i])`.
`num_threads` and `tile_size` can be either static index attributes or SSA
values of PDL operation handle type (or a mix thereof). Operation handles
must be mapped to exactly one op that has exactly one result of index type.
`num_threads` and `tile_size` can be either static index attributes or
operation handles (or a mix thereof). Operation handles must be mapped to
exactly one op that has exactly one result of index type.

Static zero tile sizes indicate that the dimension is not tiled and can be
thought of as tiling by the full size of data.
Expand Down Expand Up @@ -1872,15 +1872,15 @@ def TileToForallOp :

```
%0 = pdl_match @match_matmul in %arg1
%3:2 = transform.structured.tile_to_forall_op %0 num_threads [10, 20]
%3:2 = transform.structured.tile_using_forall %0 num_threads [10, 20]
```

#### Example using `tile_sizes`

```
%0 = pdl_match @match_matmul in %arg1
%sz = pdl_match @match_size_op in %arg1
%3:2 = transform.structured.tile_to_forall_op %0 tile_sizes [0, %sz, 20]
%3:2 = transform.structured.tile_using_forall %0 tile_sizes [0, %sz, 20]
```
}];

Expand All @@ -1892,8 +1892,8 @@ def TileToForallOp :
DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_num_threads,
DefaultValuedOptionalAttr<DenseI64ArrayAttr, "{}">:$static_tile_sizes,
OptionalAttr<DeviceMappingArrayAttr>:$mapping);
let results = (outs TransformHandleTypeInterface:$forall_op,
TransformHandleTypeInterface:$tiled_op);
let results = (outs TransformHandleTypeInterface:$tiled_op,
TransformHandleTypeInterface:$forall_op);

let builders = [
OpBuilder<(ins "Value":$target,
Expand Down
Loading