[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp #80145

Hsiangkai · 2024-01-31T14:31:40Z

When checking the load indices of the second loop coincide with the store indices of the first loop, it only considers the index values are the same or not. However, there are some cases the index values defined by other operators. In these cases, it will treat them as different even the results of defining operators are the same.

We already check if the iteration space is the same in isFusionLegal(). When checking operands of defining operators, we only need to consider the operands come from the same induction variables. If so, we know the results of defining operators are the same.

llvmbot · 2024-01-31T14:32:10Z

@llvm/pr-subscribers-mlir-scf

@llvm/pr-subscribers-mlir

Author: Hsiangkai Wang (Hsiangkai)

Changes

When checking the load indices of the second loop coincide with the store indices of the first loop, it only considers the index values are the same or not. However, there are some cases the index values come from affine.apply operator. In these cases, it will treat them as different even the affine map is the same and the affine operands are the same.

We already check if the iteration space is the same in isFusionLegal(). When checking affine.apply, we only need to consider the operands come from the same induction variables. If so, we know the results of affine.apply are the same.

Full diff: https://github.com/llvm/llvm-project/pull/80145.diff

2 Files Affected:

(modified) mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp (+31-2)
(modified) mlir/test/Dialect/SCF/parallel-loop-fusion.mlir (+46)

diff --git a/mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp b/mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp
index 8f2ab5f5e6dc1..0b7904b75e6d1 100644
--- a/mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp
@@ -10,6 +10,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "mlir/Dialect/Affine/IR/AffineOps.h"
 #include "mlir/Dialect/SCF/Transforms/Passes.h"
 
 #include "mlir/Analysis/AliasAnalysis.h"
@@ -27,6 +28,7 @@ namespace mlir {
 } // namespace mlir
 
 using namespace mlir;
+using namespace mlir::affine;
 using namespace mlir::scf;
 
 /// Verify there are no nested ParallelOps.
@@ -54,6 +56,16 @@ static bool equalIterationSpaces(ParallelOp firstPloop,
          matchOperands(firstPloop.getStep(), secondPloop.getStep());
 }
 
+static int getInductionVarIndex(Value operand, ParallelOp loop) {
+  auto indVars = loop.getInductionVars();
+  auto it = std::find(indVars.begin(), indVars.end(), operand);
+
+  if (it != indVars.end())
+    return static_cast<int>(std::distance(indVars.begin(), it));
+
+  return -1;
+}
+
 /// Checks if the parallel loops have mixed access to the same buffers. Returns
 /// `true` if the first parallel loop writes to the same indices that the second
 /// loop reads.
@@ -102,8 +114,25 @@ static bool haveNoReadsAfterWriteExceptSameIndex(
       return WalkResult::interrupt();
     for (int i = 0, e = storeIndices.size(); i < e; ++i) {
       if (firstToSecondPloopIndices.lookupOrDefault(storeIndices[i]) !=
-          loadIndices[i])
-        return WalkResult::interrupt();
+          loadIndices[i]) {
+        auto storeIndexDef = storeIndices[i].getDefiningOp<AffineApplyOp>();
+        auto loadIndexDef = loadIndices[i].getDefiningOp<AffineApplyOp>();
+        if (storeIndexDef && loadIndexDef) {
+          // When two indices come from affine.apply, we check the results of
+          // these two affine.apply are the same or not.
+          if (storeIndexDef.getAffineMap() != loadIndexDef.getAffineMap())
+            return WalkResult::interrupt();
+          if (storeIndexDef.getNumOperands() != loadIndexDef.getNumOperands())
+            return WalkResult::interrupt();
+          for (unsigned i = 0; i < storeIndexDef.getNumOperands(); ++i) {
+            if (getInductionVarIndex(storeIndexDef.getOperand(i), firstPloop) !=
+                getInductionVarIndex(loadIndexDef.getOperand(i), secondPloop))
+              return WalkResult::interrupt();
+          }
+        } else {
+          return WalkResult::interrupt();
+        }
+      }
     }
     return WalkResult::advance();
   });
diff --git a/mlir/test/Dialect/SCF/parallel-loop-fusion.mlir b/mlir/test/Dialect/SCF/parallel-loop-fusion.mlir
index 110168ba6eca5..49cce745a8ce7 100644
--- a/mlir/test/Dialect/SCF/parallel-loop-fusion.mlir
+++ b/mlir/test/Dialect/SCF/parallel-loop-fusion.mlir
@@ -480,3 +480,49 @@ func.func @do_not_fuse_multiple_stores_on_diff_indices(
 // CHECK:        scf.reduce
 // CHECK:      }
 // CHECK:      memref.dealloc [[SUM]]
+
+// -----
+
+func.func @fuse_same_indices_by_affine_apply(
+  %A: memref<2x2xf32>, %B: memref<2x2xf32>) {
+  %c0 = arith.constant 0 : index
+  %c1 = arith.constant 1 : index
+  %c2 = arith.constant 2 : index
+  %sum = memref.alloc()  : memref<2x3xf32>
+  scf.parallel (%i, %j) = (%c0, %c0) to (%c2, %c2) step (%c1, %c1) {
+    %B_elem = memref.load %B[%i, %j] : memref<2x2xf32>
+    %1 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%i, %j)
+    memref.store %B_elem, %sum[%i, %1] : memref<2x3xf32>
+    scf.reduce
+  }
+  scf.parallel (%i, %j) = (%c0, %c0) to (%c2, %c2) step (%c1, %c1) {
+    %1 = affine.apply affine_map<(d0, d1) -> (d0 + d1)>(%i, %j)
+    %sum_elem = memref.load %sum[%i, %1] : memref<2x3xf32>
+    %A_elem = memref.load %A[%i, %j] : memref<2x2xf32>
+    %product = arith.mulf %sum_elem, %A_elem : f32
+    memref.store %product, %B[%i, %j] : memref<2x2xf32>
+    scf.reduce
+  }
+  memref.dealloc %sum : memref<2x3xf32>
+  return
+}
+// CHECK:      #[[$MAP:.+]] = affine_map<(d0, d1) -> (d0 + d1)>
+// CHECK-LABEL: fuse_same_indices_by_affine_apply
+// CHECK-SAME:  (%[[ARG0:.+]]: memref<2x2xf32>, %[[ARG1:.+]]: memref<2x2xf32>) {
+// CHECK-DAG:   %[[C0:.+]] = arith.constant 0 : index
+// CHECK-DAG:   %[[C1:.+]] = arith.constant 1 : index
+// CHECK-DAG:   %[[C2:.+]] = arith.constant 2 : index
+// CHECK:       %[[ALLOC:.+]] = memref.alloc() : memref<2x3xf32>
+// CHECK-NEXT:  scf.parallel (%[[ARG2:.+]], %[[ARG3:.+]]) = (%[[C0]], %[[C0]]) to (%[[C2]], %[[C2]]) step (%[[C1]], %[[C1]]) {
+// CHECK-NEXT:    %[[S0:.+]] = memref.load %[[ARG1]][%[[ARG2]], %[[ARG3]]] : memref<2x2xf32>
+// CHECK-NEXT:    %[[S1:.+]] = affine.apply #[[$MAP]](%[[ARG2]], %[[ARG3]])
+// CHECK-NEXT:    memref.store %[[S0]], %[[ALLOC]][%[[ARG2]], %[[S1]]] : memref<2x3xf32>
+// CHECK-NEXT:    %[[S2:.+]] = affine.apply #[[$MAP]](%[[ARG2]], %[[ARG3]])
+// CHECK-NEXT:    %[[S3:.+]] = memref.load %[[ALLOC]][%[[ARG2]], %[[S2]]] : memref<2x3xf32>
+// CHECK-NEXT:    %[[S4:.+]] = memref.load %[[ARG0]][%[[ARG2]], %[[ARG3]]] : memref<2x2xf32>
+// CHECK-NEXT:    %[[S5:.+]] = arith.mulf %[[S3]], %[[S4]] : f32
+// CHECK-NEXT:    memref.store %[[S5]], %[[ARG1]][%[[ARG2]], %[[ARG3]]] : memref<2x2xf32>
+// CHECK-NEXT:    scf.reduce
+// CHECK-NEXT:  }
+// CHECK-NEXT:  memref.dealloc %[[ALLOC]] : memref<2x3xf32>
+// CHECK-NEXT:  return

mlir/test/Dialect/SCF/parallel-loop-fusion.mlir

mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp

When checking the load indices of the second loop coincide with the store indices of the first loop, it only considers the index values are the same or not. However, there are some cases the index values come from affine.apply operator. In these cases, it will treat them as different even the affine map is the same and the affine operands are the same. We already check if the iteration space is the same in isFusionLegal(). When checking affine.apply, we only need to consider the operands come from the same induction variables. If so, we know the results of affine.apply are the same.

Hardcode84

LGTM, thanks.

Please, update PR title and description before merging as it not limited to affine.apply now.

…::ParallelOp (llvm#80145) When checking the load indices of the second loop coincide with the store indices of the first loop, it only considers the index values are the same or not. However, there are some cases the index values defined by other operators. In these cases, it will treat them as different even the results of defining operators are the same. We already check if the iteration space is the same in isFusionLegal(). When checking operands of defining operators, we only need to consider the operands come from the same induction variables. If so, we know the results of defining operators are the same.

Hsiangkai requested a review from Hardcode84 January 31, 2024 14:31

llvmbot added mlir mlir:scf labels Jan 31, 2024

Hardcode84 reviewed Jan 31, 2024

View reviewed changes

mlir/test/Dialect/SCF/parallel-loop-fusion.mlir Outdated Show resolved Hide resolved

mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp Outdated Show resolved Hide resolved

Hsiangkai force-pushed the fuse-parallel branch from 8127a75 to c3fa46d Compare January 31, 2024 23:42

Hardcode84 reviewed Jan 31, 2024

View reviewed changes

mlir/lib/Dialect/SCF/Transforms/ParallelLoopFusion.cpp Outdated Show resolved Hide resolved

Hsiangkai force-pushed the fuse-parallel branch from c3fa46d to bb65dfc Compare February 1, 2024 10:52

Hardcode84 approved these changes Feb 1, 2024

View reviewed changes

Hsiangkai changed the title ~~[mlir][scf] Considering affine.apply when fusing scf::ParallelOp~~ [mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp Feb 1, 2024

Hsiangkai merged commit c3eb297 into llvm:main Feb 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp #80145

[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp #80145

Uh oh!

Hsiangkai commented Jan 31, 2024 •

edited

Loading

Uh oh!

llvmbot commented Jan 31, 2024 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hardcode84 left a comment

Uh oh!

Uh oh!

[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp #80145

[mlir][scf] Considering defining operators of indices when fusing scf::ParallelOp #80145

Uh oh!

Conversation

Hsiangkai commented Jan 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Jan 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Hardcode84 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Hsiangkai commented Jan 31, 2024 •

edited

Loading

llvmbot commented Jan 31, 2024 •

edited

Loading