
[mlir][linalg][nfc] Update pack-dynamic-inner-tile.mlir #116788

Merged (2 commits) on Nov 21, 2024

Conversation

banach-space
Contributor

Following on from #116373, updates "pack-dynamic-inner-tile.mlir" to use
TD Ops for all transformations except for lowering to LLVM.

This is an intermediate step before introducing vectorization.

@llvmbot
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-mlir-linalg

Author: Andrzej Warzyński (banach-space)

Changes

Following on from #116373, updates "pack-dynamic-inner-tile.mlir" to use
TD Ops for all transformations except for lowering to LLVM.

This is an intermediate step before introducing vectorization.


Full diff: https://github.com/llvm/llvm-project/pull/116788.diff

1 Files Affected:

  • (modified) mlir/test/Integration/Dialect/Linalg/CPU/pack-dynamic-inner-tile.mlir (+21-5)
diff --git a/mlir/test/Integration/Dialect/Linalg/CPU/pack-dynamic-inner-tile.mlir b/mlir/test/Integration/Dialect/Linalg/CPU/pack-dynamic-inner-tile.mlir
index 0428ada86041da..e214847d17c61c 100644
--- a/mlir/test/Integration/Dialect/Linalg/CPU/pack-dynamic-inner-tile.mlir
+++ b/mlir/test/Integration/Dialect/Linalg/CPU/pack-dynamic-inner-tile.mlir
@@ -2,9 +2,7 @@
 // DEFINE: -transform-interpreter -test-transform-dialect-erase-schedule |\
 // DEFINE:  mlir-opt --test-linalg-transform-patterns="test-decompose-tensor-pack"\
 // DEFINE:    --test-transform-dialect-erase-schedule \
-// DEFINE:    -one-shot-bufferize="bufferize-function-boundaries" \
-// DEFINE:    -buffer-deallocation-pipeline="private-function-dynamic-ownership" \
-// DEFINE:    -cse -canonicalize -test-lower-to-llvm -o %t
+// DEFINE:    -test-lower-to-llvm -o %t
 // DEFINE: %{entry_point} = main
 // DEFINE: %{run} = mlir-cpu-runner %t -e %{entry_point} -entry-point-result=void \
 // DEFINE:    -shared-libs=%mlir_runner_utils,%mlir_c_runner_utils
@@ -84,12 +82,30 @@ func.func private @pack(%A: tensor<7x16xi32>) {
 }
 
 module @transforms attributes { transform.with_named_sequence } {
-  transform.named_sequence @__transform_main(%module: !transform.any_op {transform.readonly}) {
+  transform.named_sequence @__transform_main(%module: !transform.any_op {transform.consume}) {
     %pack = transform.structured.match ops{["tensor.pack"]} in %module : (!transform.any_op) -> !transform.any_op
 
-    %tiled_linalg_op_p, %loops:2 = transform.structured.tile_using_for %pack tile_sizes [1, 1]
+    // 1. Tile so that we can decompose tensor.pack into tensor.pad,
+    // linalg.transpose, etc (see step 2).
+    %tiled_pack_op_p, %loops:2 = transform.structured.tile_using_for %pack tile_sizes [1, 1]
        : (!transform.any_op) -> (!transform.any_op, !transform.any_op, !transform.any_op)
 
+    // 2. Decompose the tiled Op into tensor.pad etc
+    %func_1 = transform.get_parent_op %tiled_pack_op_p {isolated_from_above} : (!transform.any_op) -> !transform.any_op
+    transform.apply_patterns to %func_1 {
+      transform.apply_patterns.linalg.decompose_pack_unpack
+    } : !transform.any_op
+
+    // 3. Bufferize before lowering to LLVM
+    %bufferize = transform.bufferization.one_shot_bufferize %module
+      {bufferize_function_boundaries=true} : (!transform.any_op) -> !transform.any_op
+
+    // 4. Canonicalize
+    %func_2 = transform.structured.match ops{["func.func"]} in %bufferize : (!transform.any_op) -> !transform.op<"func.func">
+    transform.apply_patterns to %func_2 {
+      transform.apply_patterns.canonicalization
+    } : !transform.op<"func.func">
+
     transform.yield
   }
 }
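
Step 1 of the schedule above tiles the pack with tile sizes [1, 1] so that each loop iteration handles a single inner tile, which step 2 can then decompose. The Python sketch below illustrates that idea only. It assumes the two generated loops run over the outer dimensions of the packed result; `pack_one_tile` and `tiled_pack` are hypothetical helper names, not MLIR APIs, and the padding value 123 is made up.

```python
def pack_one_tile(a, i, j, tile_size, pad_value):
    """Build the single inner tile at outer position (i, j).

    The tile has shape tile_size x 1: it covers `tile_size` consecutive
    rows of column j, padded with `pad_value` past the last source row.
    """
    rows = len(a)
    tile = []
    for k in range(tile_size):
        src_row = i * tile_size + k
        tile.append([a[src_row][j] if src_row < rows else pad_value])
    return tile


def tiled_pack(a, tile_size, pad_value):
    """Pack `a` one inner tile at a time, mimicking tiling by [1, 1]."""
    rows, cols = len(a), len(a[0])
    outer_rows = -(-rows // tile_size)  # ceiling division
    # Destination buffer, analogous to the empty tensor the pack writes into.
    dest = [[None] * cols for _ in range(outer_rows)]
    for i in range(outer_rows):  # assumed first loop (tile size 1)
        for j in range(cols):    # assumed second loop (tile size 1)
            dest[i][j] = pack_one_tile(a, i, j, tile_size, pad_value)
    return dest


# 7x16 input as in the test; the element values are illustrative.
A = [[r * 16 + c for c in range(16)] for r in range(7)]
result = tiled_pack(A, tile_size=4, pad_value=123)
```

Each call to `pack_one_tile` only needs a pad and a copy of a small slice, which is why tiling first makes the later decomposition straightforward.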

%pack = transform.structured.match ops{["tensor.pack"]} in %module : (!transform.any_op) -> !transform.any_op

%tiled_linalg_op_p, %loops:2 = transform.structured.tile_using_for %pack tile_sizes [1, 1]
// 1. Tile so that we can decompose tensor.pack into tensor.pad,
// linalg.transpose, etc (see step 2).
Contributor

I found this in LinalgTransformOps.td, which answered my question:
"Rewrite a tensor.pack into tensor.pad + tensor.expand_shape + linalg.transpose."

Using that wording would remove the need for the ambiguous "linalg.transpose, etc (see step 2)".

I don't know if it's too much to ask, and there may be a test example elsewhere, but what is

%A_pack = tensor.pack %A
    padding_value(%pad_val : i32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%tile_size, 1]
    into %A_pack_empty : tensor<7x16xi32> -> tensor<?x16x?x1xi32>
%A_cast = tensor.cast %A_pack : tensor<?x16x?x1xi32> to tensor<*xi32>

expected to become? Even a brief sketch in a comment would be super.

Contributor Author

This is a valid request - thank you for bringing it up, and please don’t hesitate to ask in the future. I've been working on this for so long that I may have lost perspective on what's obvious versus what could use more explanation.

I've added some additional comments to clarify what’s happening here. I’ve intentionally skipped some finer details to keep the explanation focused and easier to follow.
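
For readers following this thread, the data layout produced by the tensor.pack in question can be sketched in plain Python. This is a minimal illustration of the packing semantics (inner_dims_pos = [0, 1], inner_tiles = [tile_size, 1]), not the comments that were added to the test. `pack_rows` is a hypothetical name, and the tile size 4 and padding value 123 are assumed for the example.

```python
def pack_rows(a, tile_size, pad_value):
    """Pack a 2-D list (rows x cols) into [outer_rows][cols][tile_size][1].

    Dimension 0 is split into tiles of `tile_size` rows and dimension 1
    into tiles of one column; both tile dimensions become the innermost
    dimensions of the result. Rows past the end of the source are filled
    with `pad_value`.
    """
    rows, cols = len(a), len(a[0])
    outer_rows = -(-rows // tile_size)  # ceiling division
    packed = []
    for i in range(outer_rows):
        outer = []
        for j in range(cols):
            tile = []
            for k in range(tile_size):
                src_row = i * tile_size + k
                value = a[src_row][j] if src_row < rows else pad_value
                tile.append([value])  # innermost dimension has size 1
            outer.append(tile)
        packed.append(outer)
    return packed


# 7x16 input as in the test; the element values are illustrative.
A = [[r * 16 + c for c in range(16)] for r in range(7)]
packed = pack_rows(A, tile_size=4, pad_value=123)
shape = (len(packed), len(packed[0]), len(packed[0][0]), len(packed[0][0][0]))
print(shape)  # (2, 16, 4, 1)
```

With 7 source rows and a tile size of 4, the second outer row holds three real rows and one row of padding, which matches the dynamic `?` dimensions in tensor<?x16x?x1xi32>.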

Contributor

@javedabsar1 javedabsar1 left a comment

thanks.

@banach-space banach-space merged commit d7d6fb1 into llvm:main Nov 21, 2024
8 checks passed