[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) #119121

banach-space · 2024-12-08T09:13:19Z

Tests in "vectorize-tensor-extract.mlir" are inconsistent and would
benefit from refactoring to:

Clearly categorize tests into "contiguous load," "gather load," and
"scalar load + broadcast" cases, reflecting the structure of
tensor.extract vectorization.
Unify variable naming (both MLIR and FileCheck).
Ensure all tests exercise unmasked vectorization (masked vectorization
is covered in "vectorize-tensor-extract-masked.mlir").
Improve and standardize formatting.

These changes will make it easier to identify the test cases being
exercised and simplify future maintenance or refactoring.

This is patch 3/N in the series. Below is a summary of the changes in
this patch.

Summary of patch 3/N

Cluster all tests for "scalar load + broadcast" together
Unify MLIR and FileCheck variable names, e.g. %input, %output ->
%src, %init.

Note, I haven't changed test function names to make it easier to track
changes (this PR is mostly about moving code). I will send a seperate PR
to rename the tests.

Previous patches

llvmbot · 2024-12-08T09:13:52Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: Andrzej Warzyński (banach-space)

Changes

[mlir][linalg] Enable Vectorization of 0-D tensor.extract
[mlir][nfc] Update vectorize-tensor-extract.mlir (1/N)
[mlir][nfc] Update vectorize-tensor-extract.mlir (2/N)
[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N)

Patch is 36.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/119121.diff

4 Files Affected:

(modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (-5)
(added) mlir/test/Dialect/Linalg/td/vectorize-with-patterns.mlir (+10)
(modified) mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir (+45-3)
(modified) mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir (+155-305)

diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index e5c96b52acee23..863f2280e46ce6 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1115,11 +1115,6 @@ vectorizeTensorExtract(RewriterBase &rewriter, VectorizationState &state,
   //  b. contiguous loads.
   // Both cases use vector.transfer_read.
 
-  assert(llvm::count_if(resultType.getShape(),
-                        [](uint64_t dim) { return dim != 1; }) &&
-         "Contiguous loads and scalar loads + broadcast only support 1-D "
-         "vectors ATM!");
-
   // Collect indices for `vector.transfer_read`. At this point, the indices will
   // either be scalars or would have been broadcast to vectors matching the
   // result type. For indices that are vectors, there are two options:
diff --git a/mlir/test/Dialect/Linalg/td/vectorize-with-patterns.mlir b/mlir/test/Dialect/Linalg/td/vectorize-with-patterns.mlir
new file mode 100644
index 00000000000000..f8d1a50d7430df
--- /dev/null
+++ b/mlir/test/Dialect/Linalg/td/vectorize-with-patterns.mlir
@@ -0,0 +1,10 @@
+module @transforms attributes { transform.with_named_sequence } {
+  transform.named_sequence @vectorize_with_patterns(%module: !transform.any_op {transform.readonly}) {
+
+    %0 = transform.structured.match ops{["linalg.generic"]} in %module : (!transform.any_op) -> !transform.any_op
+    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
+    %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
+
+    transform.yield
+   }
+}
diff --git a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
index 0c996bed996d3c..b688a677500c22 100644
--- a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
@@ -122,6 +122,48 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
+#map = affine_map<() -> ()>
+
+// CHECK-LABEL:   func.func @generic_0d(
+// CHECK-SAME:     %[[ARG_0:.*]]: tensor<f32>, %[[ARG_1:.*]]: tensor<f32>, %[[ARG_2:.*]]: tensor<f32>)
+func.func @generic_0d(%arg0: tensor<f32>, %arg1: tensor<f32>,
+                      %arg2: tensor<f32>) -> tensor<f32> {
+// CHECK:           %[[PAD:.*]] = arith.constant 0.000000e+00 : f32
+// CHECK:           %[[READ_0:.*]] = vector.transfer_read %[[ARG_0]][], %[[PAD]] : tensor<f32>, vector<f32>
+// CHECK:           %[[ARG_0_AS_SCALAR:.*]] = vector.extract %[[READ_0]][] : f32 from vector<f32>
+// CHECK:           %[[READ_1:.*]] = vector.transfer_read %[[ARG_1]][], %[[PAD]] : tensor<f32>, vector<f32>
+// CHECK:           %[[ARG_1_AS_SCALAR:.*]] = vector.extract %[[READ_1]][] : f32 from vector<f32>
+// CHECK:           %[[READ_2:.*]] = vector.transfer_read %[[ARG_2]][], %[[PAD]] : tensor<f32>, vector<f32>
+// CHECK:           %[[ARG_2_AS_SCALAR:.*]] = vector.extract %[[READ_2]][] : f32 from vector<f32>
+// CHECK:           %[[MULF:.*]] = arith.mulf %[[ARG_0_AS_SCALAR]], %[[ARG_1_AS_SCALAR]] : f32
+// CHECK:           %[[ADDF:.*]] = arith.addf %[[ARG_2_AS_SCALAR]], %[[MULF]] : f32
+// CHECK:           %[[ADDF_BCAST:.*]] = vector.broadcast %[[ADDF]] : f32 to vector<f32>
+// CHECK:           vector.transfer_write %[[ADDF_BCAST]], %[[ARG_2]][] : vector<f32>, tensor<f32>
+  %res = linalg.generic {
+    indexing_maps = [#map, #map, #map],
+    iterator_types = []
+  } ins(%arg0, %arg1 : tensor<f32>, tensor<f32>)
+    outs(%arg2 : tensor<f32>) {
+  ^bb(%a: f32, %b: f32, %c: f32) :
+    %d = arith.mulf %a, %b: f32
+    %e = arith.addf %c, %d: f32
+    linalg.yield %e : f32
+  } -> tensor<f32>
+
+  return %res : tensor<f32>
+}
+
+module attributes {transform.with_named_sequence} {
+  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
+    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
+    %2 = transform.structured.vectorize_children_and_apply_patterns %1  { disable_multi_reduction_to_contract_patterns, disable_transfer_permutation_map_lowering_patterns } : (!transform.any_op) -> !transform.any_op
+    transform.yield
+  }
+}
+
+// -----
+
 #matmul_transpose_out_trait = {
   indexing_maps = [
     affine_map<(m, n, k) -> (m, k)>,
@@ -372,7 +414,7 @@ module attributes {transform.with_named_sequence} {
 // -----
 
 // CHECK-LABEL: func @test_vectorize_fill
-func.func @test_vectorize_fill_scalar(%A : memref<f32>, %arg0 : f32) {
+func.func @test_vectorize_fill_0d(%A : memref<f32>, %arg0 : f32) {
   // CHECK-SAME: (%[[M:.*]]: memref<f32>, %[[val:.*]]: f32)
   //      CHECK:   %[[VEC:.*]] = vector.broadcast %[[val]] : f32 to vector<f32>
   //      CHECK:   vector.transfer_write %[[VEC]], %[[M]][] : vector<f32>, memref<f32>
@@ -410,8 +452,8 @@ module attributes {transform.with_named_sequence} {
 
 // -----
 
-// CHECK-LABEL: func @test_vectorize_copy_scalar
-func.func @test_vectorize_copy_scalar(%A : memref<f32>, %B : memref<f32>) {
+// CHECK-LABEL: func @test_vectorize_copy_0d
+func.func @test_vectorize_copy_0d(%A : memref<f32>, %B : memref<f32>) {
   //  CHECK-SAME: (%[[A:.*]]: memref<f32>, %[[B:.*]]: memref<f32>)
   //       CHECK:   %[[V:.*]] = vector.transfer_read %[[A]][]{{.*}} : memref<f32>, vector<f32>
   //       CHECK:   %[[val:.*]] = vector.extract %[[V]][] : f32 from vector<f32>
diff --git a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
index 1a93d1cd9b7880..1491c6298691a2 100644
--- a/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
+++ b/mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
@@ -1,4 +1,6 @@
-// RUN: mlir-opt %s -transform-interpreter -split-input-file | FileCheck %s
+// RUN: mlir-opt -split-input-file \
+// RUN: -transform-preload-library='transform-library-paths=%p/td/vectorize-with-patterns.mlir' \
+// RUN: -transform-interpreter=entry-point=vectorize_with_patterns %s | FileCheck %s
 
 #map0 = affine_map<(d0, d1, d2, d3) -> (d0, d2)>
 #map1 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2, d3)>
@@ -27,79 +29,6 @@ func.func @vectorize_1d_tensor_extract(%arg0: tensor<3xf32>, %arg1: tensor<4x3xi
 // CHECK: %[[GATHER:.*]] = vector.gather %[[ARG0]][%[[C0]]] [%[[INDICES]]], %[[MASK]], %[[PASSTHRU]]
 // CHECK: vector.transfer_write %[[GATHER]]
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
-// -----
-
-#map = affine_map<() -> ()>
-func.func @negative_no_loops(%arg0: tensor<f32>, %arg1: tensor<f32>) -> tensor<f32> {
-  %1 = linalg.generic {
-    indexing_maps = [#map],
-    iterator_types = []
-  } outs(%arg1 : tensor<f32>) {
-  ^bb0(%arg4: f32):
-    %2 = tensor.extract %arg0[] : tensor<f32>
-    linalg.yield %2 : f32
-  } -> tensor<f32>
-  return %1 : tensor<f32>
-}
-// CHECK-LABEL: func.func @negative_no_loops
-// CHECK: tensor.extract
-
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
-
-// -----
-
-#map = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
-func.func @vectorize_nd_tensor_extract_scalar_broadcast(%arg0: tensor<3x3xf32>, %arg2: tensor<1x1x3xf32>) -> tensor<1x1x3xf32> {
-  %c0 = arith.constant 1 : index
-  %c1 = arith.constant 2 : index
-  %2 = linalg.generic {
-    indexing_maps = [#map],
-    iterator_types = ["parallel", "parallel", "parallel"]
-  } outs(%arg2 : tensor<1x1x3xf32>) {
-  ^bb0(%arg4: f32):
-    %7 = tensor.extract %arg0[%c0, %c1] : tensor<3x3xf32>
-    linalg.yield %7 : f32
-  } -> tensor<1x1x3xf32>
-  return %2 : tensor<1x1x3xf32>
-}
-
-// CHECK: #[[$MAP:.+]] = affine_map<(d0, d1) -> (0, 0, 0)>
-// CHECK-LABEL:   func.func @vectorize_nd_tensor_extract_scalar_broadcast(
-// CHECK-SAME:      %[[ARG_0:.*]]: tensor<3x3xf32>,
-// CHECK-SAME:      %[[ARG_1:.*]]: tensor<1x1x3xf32>) -> tensor<1x1x3xf32> {
-// CHECK-DAG:       %[[C1:.*]] = arith.constant 1 : index
-// CHECK-DAG:       %[[C2:.*]] = arith.constant 2 : index
-// CHECK-DAG:       %[[C0:.*]] = arith.constant 0 : index
-// CHECK:           %[[MASK:.*]] = vector.constant_mask [1] : vector<1xi1>
-// CHECK:           %[[READ:.*]] = vector.mask %[[MASK]] { vector.transfer_read %[[ARG_0]][%[[C1]], %[[C2]]], {{.*}} {in_bounds = [true, true, true], permutation_map = #[[$MAP]]} : tensor<3x3xf32>, vector<1x1x3xf32> } : vector<1xi1> -> vector<1x1x3xf32>
-// CHECK:           %[[C0_2:.*]] = arith.constant 0 : index
-// CHECK:           vector.transfer_write %[[READ]], %[[ARG_1]]{{\[}}%[[C0_2]], %[[C0_2]], %[[C0_2]]] : vector<1x1x3xf32>, tensor<1x1x3xf32>
-
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-     %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-    transform.structured.vectorize %0 { vectorize_nd_extract }  : !transform.any_op
-    transform.yield
-   }
-}
-
 // -----
 
 #map = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
@@ -138,46 +67,6 @@ func.func @vectorize_nd_tensor_extract_transfer_read_basic(
 // CHECK:   %[[READ:.*]] = vector.transfer_read %[[ARG0]][%[[IDX1]], %[[IDX2]], %[[IDX3]]], %[[CST]] {in_bounds = [true, true, true]} : tensor<3x3x3xf32>, vector<1x1x3xf32>
 // CHECK:   vector.transfer_write %[[READ]], %[[ARG1]][%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true]} : vector<1x1x3xf32>, tensor<1x1x3xf32>
 
-// Same as example above, but reading into a column tensor.
-
-// TODO: Currently this fails to vectorise when the indices are non-constant.
-
-func.func @vectorize_nd_tensor_extract_transfer_read_basic_column(
-    %input: tensor<3x3x3xf32>,
-    %output: tensor<3x1x1xf32>) -> tensor<3x1x1xf32> {
-
-  %c0 = arith.constant 0 : index
-  %res = linalg.generic {
-    indexing_maps = [#map],
-    iterator_types = ["parallel", "parallel", "parallel"]
-  } outs(%output : tensor<3x1x1xf32>) {
-  ^bb0(%out: f32):
-    %5 = tensor.extract %input[%c0, %c0, %c0] : tensor<3x3x3xf32>
-    linalg.yield %5 : f32
-  } -> tensor<3x1x1xf32>
-
-  return %res : tensor<3x1x1xf32>
-}
-
-// CHECK-LABEL:   func.func @vectorize_nd_tensor_extract_transfer_read_basic_column(
-// CHECK-SAME:      %[[INPUT:.*]]: tensor<3x3x3xf32>,
-// CHECK-SAME:      %[[OUTPUT:.*]]: tensor<3x1x1xf32>)
-// CHECK-DAG:        %[[C0:.*]] = arith.constant 0 : index
-// CHECK-DAG:        %[[CST_0:.*]] = arith.constant 0.000000e+00 : f32
-// CHECK:           %[[READ:.*]] = vector.transfer_read %[[INPUT]]{{\[}}%[[C0]], %[[C0]], %[[C0]]], %[[CST_0]] : tensor<3x3x3xf32>, vector<f32>
-// CHECK:           %[[BCAST:.*]] = vector.broadcast %[[READ]] : vector<f32> to vector<3x1x1xf32>
-// CHECK:           %[[RES:.*]] = vector.transfer_write %[[BCAST]], %[[OUTPUT]]{{\[}}%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true]} : vector<3x1x1xf32>, tensor<3x1x1xf32>
-// CHECK:           return %[[RES]] : tensor<3x1x1xf32>
-
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
  // -----
 
 func.func @vectorize_nd_tensor_extract_transfer_read_complex(%6: tensor<45x80x16xf32>, %arg0: index, %arg2: index, %arg1: index, %arg4: index, %extracted_slice : tensor<1x4xf32>) -> tensor<1x4xf32> {
@@ -221,15 +110,6 @@ func.func @vectorize_nd_tensor_extract_transfer_read_complex(%6: tensor<45x80x16
 // CHECK:           return %[[VAL_21]] : tensor<1x4xf32>
 // CHECK:         }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-     %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-     %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-     %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-     transform.yield
-   }
-}
-
 // -----
 
 #map0 = affine_map<(d0, d1, d2, d3) -> (d0, d2)>
@@ -271,15 +151,6 @@ func.func @vectorize_nd_tensor_extract_index_from_tensor(%arg0: tensor<3x3xf32>,
 // CHECK:    %[[GATHER:.*]] = vector.gather %[[ARG0]][%[[C0]], %[[C0]]] [%[[T]]], %[[CST_1]], %[[PASSTHRU]] : tensor<3x3xf32>, vector<4x7x3x2xindex>, vector<4x7x3x2xi1>, vector<4x7x3x2xf32> into vector<4x7x3x2xf32>
 // CHECK:    vector.transfer_write %[[GATHER]], %[[ARG4]][%[[C0]], %[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true, true]} : vector<4x7x3x2xf32>, tensor<4x7x3x2xf32>
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
 // -----
 
 #map = affine_map<(d0, d1) -> (d0, d1)>
@@ -301,15 +172,6 @@ func.func @vectorize_nd_tensor_extract_load_1d_column_vector_using_gather_load(%
   return %1 : tensor<8x1xf32>
 }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg0 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_nd_extract} : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
 // CHECK-LABEL: func.func @vectorize_nd_tensor_extract_load_1d_column_vector_using_gather_load
 // CHECK-SAME: %[[ARG0:.*]]: tensor<8x128x768xf32>
 // CHECK-SAME: %[[ARG1:.*]]: index
@@ -352,15 +214,6 @@ func.func @index_from_output_column_vector_gather_load(%src: tensor<8x128xf32>)
   return %res : tensor<8x1xf32>
 }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg2: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg2 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_nd_extract} : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
 // CHECK-LABEL:   func.func @index_from_output_column_vector_gather_load(
 // CHECK-SAME:      %[[SRC:.*]]: tensor<8x128xf32>) -> tensor<8x1xf32> {
 // CHECK:           %[[C128:.*]] = arith.constant dense<128> : vector<1x8xindex>
@@ -399,15 +252,6 @@ func.func @index_from_output_column_vector_contiguous_load(%src: tensor<8x128xf3
   return %res : tensor<8x1xf32>
 }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg2: !transform.any_op {transform.readonly}) {
-    %0 = transform.structured.match ops{["linalg.generic"]} in %arg2 : (!transform.any_op) -> !transform.any_op
-    %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-    %2 = transform.structured.vectorize_children_and_apply_patterns %1 {vectorize_nd_extract} : (!transform.any_op) -> !transform.any_op
-    transform.yield
-  }
-}
-
 // CHECK-LABEL:   func.func @index_from_output_column_vector_contiguous_load(
 // CHECK-SAME:      %[[SRC:.*]]: tensor<8x128xf32>) -> tensor<8x1xf32> {
 // CHECK:           %[[C0:.*]] = arith.constant 0 : index
@@ -459,15 +303,6 @@ func.func @vectorize_nd_tensor_extract_contiguous_and_gather(%arg0: tensor<6xf32
 // CHECK:           %[[VAL_14:.*]] = vector.transfer_write %[[VAL_13]], %[[VAL_8]]{{\[}}%[[VAL_2]]] {in_bounds = [true]} : vector<5xf32>, tensor<5xf32>
 // CHECK:           return %[[VAL_14]] : tensor<5xf32>
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-     %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-     %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-     %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-     transform.yield
-   }
-}
-
 // -----
 
 // The vectorizer converts `affine.apply` so that the subsequent Ops can be vectorised based on the converted ops. Contiguous load.
@@ -502,15 +337,6 @@ func.func @vectorize_nd_tensor_extract_with_affine_apply_contiguous(%6: tensor<8
 // CHECK:           return %[[VAL_12]] : tensor<1x4xf32>
 // CHECK:         }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-     %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-     %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-     %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-     transform.yield
-   }
-}
-
 // -----
 
 func.func @vectorize_nd_tensor_extract_with_tensor_extract(%input_1: tensor<1x20xi32>, %input_2: tensor<257x24xf32>, %arg0 : index, %arg1 : index, %arg2 : index, %arg3 : index) -> tensor<1x1x4xf32> {
@@ -547,16 +373,6 @@ func.func @vectorize_nd_tensor_extract_with_tensor_extract(%input_1: tensor<1x20
 // for address calculation also satisfy the required conditions).
 // CHECK:           vector.transfer_read %[[INPUT_2]][%{{.*}}, %{{.*}}, %{{.*}} {in_bounds = [true, true]} : tensor<257x24xf32>, vector<1x4xf32>
 
-
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
-     %0 = transform.structured.match ops{["linalg.generic"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-     %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
-     %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_nd_extract } : (!transform.any_op) -> !transform.any_op
-     transform.yield
-   }
-}
-
 // -----
 
 // The vectorizer converts `affine.apply` so that the subsequent Ops can be vectorised based on the converted ops. Gather load.
@@ -594,15 +410,6 @@ func.func @vectorize_nd_tensor_extract_with_affine_apply_gather(%6: tensor<80x16
 // CHECK:           return %[[VAL_14]] : tensor<1x4xf32>
 // CHECK:         }
 
-module attributes {transform.with_named_sequence} {
-  transform.named_sequence @__transform_main(%arg1: !tr...
[truncated]

hanhanW · 2024-12-10T17:42:44Z

DEPENDS ON

[mlir][nfc] Update vectorize-tensor-extract.mlir (1/N) #118977

[mlir][linalg] Enable Vectorization of 0-D tensor.extract #119079

[mlir][nfc] Update vectorize-tensor-extract.mlir (1/N) #118977

The third PR link is as the same as the first one. I think it should be #119080, not a big deal though. Same question, given that you're touching the lit checks, could we use CHECK-DAG for constants?

banach-space · 2024-12-10T18:26:27Z

DEPENDS ON

[mlir][nfc] Update vectorize-tensor-extract.mlir (1/N) #118977

[mlir][linalg] Enable Vectorization of 0-D tensor.extract #119079

[mlir][nfc] Update vectorize-tensor-extract.mlir (1/N) #118977

The third PR link is as the same as the first one. I think it should be #119080, not a big deal though. Same question, given that you're touching the lit checks, could we use CHECK-DAG for constants?

Done: 1dcd71f 👍🏻

banach-space · 2024-12-10T18:26:54Z

Same question, given that you're touching the lit checks, could we use CHECK-DAG for constants?

Done: 1dcd71f 👍🏻

hanhanW

Thanks!

dcaballe · 2024-12-11T05:54:01Z

Thanks! I can see that you are putting a lot of effort in naming things properly, including the CHECK variable. I appreciate the effort. I wonder if it would make sense to implement a utility that could help with that and make it more efficient. Just a random thought. Thanks!

Tests in "vectorize-tensor-extract.mlir" are inconsistent and would benefit from refactoring to: * Clearly categorize tests into "contiguous load," "gather load," and "scalar load + broadcast" cases, reflecting the structure of tensor.extract vectorization. * Unify variable naming (both MLIR and FileCheck). * Ensure all tests exercise unmasked vectorization (masked vectorization is covered in "vectorize-tensor-extract-masked.mlir"). * Improve and standardize formatting. These changes will make it easier to identify the test cases being exercised and simplify future maintenance or refactoring. This is patch 3/N in the series. Below is a summary of the changes in this patch. ---------------------------------------------------------------------- Summary of patch 3/N ---------------------------------------------------------------------- * Cluster all tests for "scalar load + broadcast" together * Unify MLIR and FileCheck variable names, e.g. `%input`, `%output` -> `%src`, `%init`. Note, I haven't changed test function names to make it easier to track changes (this PR is mostly about moving code). I will send a seperate PR to rename the tests. ---------------------------------------------------------------------- **DEPENDS ON** * llvm#118977 * llvm#119079 * llvm#118977 Please only review the top commit

Add DAG to FileCHeck prefixes

banach-space · 2024-12-11T15:20:13Z

I can see that you are putting a lot of effort in naming things properly, including the CHECK variable. I appreciate the effort.

Thank you :)

I wonder if it would make sense to implement a utility that could help with that and make it more efficient.

We could extend generate-test-checks.py to generate "smarter" LIT variable names for function arguments, e.g.,

%src -> %[[SRC:.*]].

That said, I’m not sure how much beyond this could be effectively automated. In my view, writing good tests with descriptive variables requires thoughtful input from humans who understand:

the code being tested.
the purpose of the test itself.

From there, we ("humans") need to ensure consistent naming within each test file. Consistency is especially valuable when dealing with hundreds of CHECK lines.

In the case of the "vectorization of tensor.extract" tests, I’ll admit I didn’t do the best job (that was mostly me!), so I’m now trying to fix it - hopefully to the benefit of everyone! 😊

Tests in "vectorize-tensor-extract.mlir" are inconsistent and would benefit from refactoring to: * Clearly categorize tests into "contiguous load," "gather load," and "scalar load + broadcast" cases, reflecting the structure of tensor.extract vectorization. * Unify variable naming (both MLIR and FileCheck). * Ensure all tests exercise unmasked vectorization (masked vectorization is covered in "vectorize-tensor-extract-masked.mlir"). * Improve and standardize formatting. These changes will make it easier to identify the test cases being exercised and simplify future maintenance or refactoring. This is patch 3/N in the series. Below is a summary of the changes in this patch. ---------------------------------------------------------------------- Summary of patch 4/N ---------------------------------------------------------------------- * Cluster all tests for "contiguous load" and "gather load" together (in 2 seperate groups). Note that this is merely moving things around. ---------------------------------------------------------------------- Previous patches ---------------------------------------------------------------------- * llvm#118977 * llvm#119080 * llvm#119121

banach-space requested review from dcaballe, hanhanW, nicolasvasilache and Groverkss as code owners December 8, 2024 09:13

llvmbot added mlir:linalg mlir labels Dec 8, 2024

banach-space changed the title ~~andrzej/refact extract vec test 1~~ [mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) Dec 8, 2024

banach-space force-pushed the andrzej/refact_extract_vec_test_1 branch from f000d76 to 1b053b5 Compare December 8, 2024 09:14

banach-space force-pushed the andrzej/refact_extract_vec_test_1 branch from 1b053b5 to 1dcd71f Compare December 10, 2024 18:25

hanhanW approved these changes Dec 11, 2024

View reviewed changes

dcaballe approved these changes Dec 11, 2024

View reviewed changes

banach-space closed this Dec 11, 2024

banach-space deleted the andrzej/refact_extract_vec_test_1 branch December 11, 2024 07:52

banach-space restored the andrzej/refact_extract_vec_test_1 branch December 11, 2024 08:33

banach-space reopened this Dec 11, 2024

banach-space added 2 commits December 11, 2024 08:39

fixup! [mlir][nfc] Update vectorize-tensor-extract.mlir (3/N)

9369a5d

Add DAG to FileCHeck prefixes

banach-space force-pushed the andrzej/refact_extract_vec_test_1 branch from 1dcd71f to 9369a5d Compare December 11, 2024 08:39

banach-space merged commit e0c6088 into llvm:main Dec 11, 2024
8 checks passed

banach-space mentioned this pull request Dec 12, 2024

[mlir][nfc] Update vectorize-tensor-extract.mlir (4/N) #119697

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) #119121

[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) #119121

Uh oh!

banach-space commented Dec 8, 2024 •

edited

Loading

Uh oh!

llvmbot commented Dec 8, 2024 •

edited

Loading

Uh oh!

hanhanW commented Dec 10, 2024

Uh oh!

banach-space commented Dec 10, 2024

Uh oh!

banach-space commented Dec 10, 2024

Uh oh!

hanhanW left a comment

Uh oh!

dcaballe commented Dec 11, 2024

Uh oh!

banach-space commented Dec 11, 2024

Uh oh!

Uh oh!

Uh oh!

[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) #119121

[mlir][nfc] Update vectorize-tensor-extract.mlir (3/N) #119121

Uh oh!

Conversation

banach-space commented Dec 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary of patch 3/N

Previous patches

Uh oh!

llvmbot commented Dec 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hanhanW commented Dec 10, 2024

Uh oh!

banach-space commented Dec 10, 2024

Uh oh!

banach-space commented Dec 10, 2024

Uh oh!

hanhanW left a comment

Choose a reason for hiding this comment

Uh oh!

dcaballe commented Dec 11, 2024

Uh oh!

banach-space commented Dec 11, 2024

Uh oh!

Uh oh!

Uh oh!

banach-space commented Dec 8, 2024 •

edited

Loading

llvmbot commented Dec 8, 2024 •

edited

Loading