[mlir][vector] Fix dominance error in warp vector distribution #77771

matthias-springer · 2024-01-11T13:57:32Z

This commit fixes a test in `vector-warp-distribute.mlir` when `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` is enabled.

within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: error: operand #0 does not dominate this use
    %1 = vector.extract %0[9] : f32 from vector<64xf32>
         ^
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: see current operation: %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: operand defined here (op in a child region)
"func.func"() <{function_type = (index) -> f32, sym_name = "vector_extract_1d"}> ({
^bb0(%arg0: index):
  %0:2 = "vector.warp_execute_on_lane_0"(%arg0) <{warp_size = 32 : i64}> ({
    %7 = "some_def"() : () -> vector<64xf32>
    %8 = "arith.constant"() <{value = 9 : index}> : () -> index
    %9 = "vector.extractelement"(%7, %8) : (vector<64xf32>, index) -> f32
    "vector.yield"(%9, %7) : (f32, vector<64xf32>) -> ()
  }) : (index) -> (f32, vector<2xf32>)
  %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
  %2 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 mod 2)>}> : (index) -> index
  %3 = "vector.extractelement"(%0#1, %2) : (vector<2xf32>, index) -> f32
  %4 = "arith.index_cast"(%1) : (index) -> i32
  %5 = "arith.constant"() <{value = 32 : i32}> : () -> i32
  %6:2 = "gpu.shuffle"(%3, %4, %5) <{mode = #gpu<shuffle_mode idx>}> : (f32, i32, i32) -> (f32, i1)
  "func.return"(%6#0) : (f32) -> ()
}) : () -> ()
LLVM ERROR: IR failed to verify after pattern application

The position at which `vector.extractelement` extracts must also be distributed. The fix in `WarpOpExtractElement` is similar to `WarpOpInsertElement`.
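
To illustrate why the verifier rejects the rewritten IR, here is a minimal toy model of region visibility. It is a sketch only and does not use any MLIR API: `RegionTree`, `addRegion`, `encloses`, and `visibleAt` are hypothetical names, and ordering of operations within a block is ignored. The assumption is simply that a value defined inside a region cannot be used by an operation outside that region, while values from enclosing regions stay visible inside.

```cpp
#include <cassert>
#include <vector>

// Toy model (not the MLIR API): every region records its parent region.
// A parent of -1 marks a top-level region.
struct RegionTree {
  std::vector<int> parent;

  // Adds a region nested in `parentId` and returns the new region's id.
  int addRegion(int parentId) {
    assert(parentId >= -1 && parentId < static_cast<int>(parent.size()));
    parent.push_back(parentId);
    return static_cast<int>(parent.size()) - 1;
  }

  // True if `ancestor` is `region` itself or one of its enclosing regions.
  bool encloses(int ancestor, int region) const {
    for (int r = region; r != -1; r = parent[r])
      if (r == ancestor)
        return true;
    return false;
  }
};

// A value defined in `defRegion` is visible to a use in `useRegion` only if
// `defRegion` encloses `useRegion`. Ordering within a block is ignored here.
bool visibleAt(const RegionTree &tree, int defRegion, int useRegion) {
  return tree.encloses(defRegion, useRegion);
}
```

In this model, the position `%8` lives in the body of `vector.warp_execute_on_lane_0`, while the new `affine.apply` lives in the function body, so the use is not visible. A result of the warp op is defined in the function body itself, which is why yielding the position and using the corresponding result avoids the error.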

llvmbot · 2024-01-11T13:57:59Z

@llvm/pr-subscribers-mlir

Author: Matthias Springer (matthias-springer)

Changes

This commit fixes a test in vector-warp-distribute.mlir when MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS is enabled.

within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: error: operand #0 does not dominate this use
    %1 = vector.extract %0[9] : f32 from vector<64xf32>
         ^
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: see current operation: %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Dialect/Vector/vector-warp-distribute.mlir:1 offset :18:10: note: operand defined here (op in a child region)
"func.func"() <{function_type = (index) -> f32, sym_name = "vector_extract_1d"}> ({
^bb0(%arg0: index):
  %0:2 = "vector.warp_execute_on_lane_0"(%arg0) <{warp_size = 32 : i64}> ({
    %7 = "some_def"() : () -> vector<64xf32>
    %8 = "arith.constant"() <{value = 9 : index}> : () -> index
    %9 = "vector.extractelement"(%7, %8) : (vector<64xf32>, index) -> f32
    "vector.yield"(%9, %7) : (f32, vector<64xf32>) -> ()
  }) : (index) -> (f32, vector<2xf32>)
  %1 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 ceildiv 2)>}> : (index) -> index
  %2 = "affine.apply"(%8) <{map = affine_map<()[s0] -> (s0 mod 2)>}> : (index) -> index
  %3 = "vector.extractelement"(%0#1, %2) : (vector<2xf32>, index) -> f32
  %4 = "arith.index_cast"(%1) : (index) -> i32
  %5 = "arith.constant"() <{value = 32 : i32}> : () -> i32
  %6:2 = "gpu.shuffle"(%3, %4, %5) <{mode = #gpu<shuffle_mode idx>}> : (f32, i32, i32) -> (f32, i1)
  "func.return"(%6#0) : (f32) -> ()
}) : () -> ()
LLVM ERROR: IR failed to verify after pattern application

Full diff: https://github.com/llvm/llvm-project/pull/77771.diff

1 Files Affected:

(modified) mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp (+13-5)

diff --git a/mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp b/mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
index 074356ab425377..ec6f1dea2f5454 100644
--- a/mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/VectorDistribute.cpp
@@ -1329,11 +1329,17 @@ struct WarpOpExtractElement : public OpRewritePattern<WarpExecuteOnLane0Op> {
     } else {
       distributedVecType = extractSrcType;
     }
-    // Yield source vector from warp op.
+    // Yield source vector and position (if present) from warp op.
+    SmallVector<Value> additionalResults{extractOp.getVector()};
+    SmallVector<Type> additionalResultTypes{distributedVecType};
+    if (static_cast<bool>(extractOp.getPosition())) {
+      additionalResults.push_back(extractOp.getPosition());
+      additionalResultTypes.push_back(extractOp.getPosition().getType());
+    }
     Location loc = extractOp.getLoc();
     SmallVector<size_t> newRetIndices;
     WarpExecuteOnLane0Op newWarpOp = moveRegionToNewWarpOpAndAppendReturns(
-        rewriter, warpOp, {extractOp.getVector()}, {distributedVecType},
+        rewriter, warpOp, additionalResults, additionalResultTypes,
         newRetIndices);
     rewriter.setInsertionPointAfter(newWarpOp);
     Value distributedVec = newWarpOp->getResult(newRetIndices[0]);
@@ -1362,14 +1368,16 @@ struct WarpOpExtractElement : public OpRewritePattern<WarpExecuteOnLane0Op> {
     AffineExpr sym0 = getAffineSymbolExpr(0, rewriter.getContext());
     // tid of extracting thread: pos / elementsPerLane
     Value broadcastFromTid = rewriter.create<affine::AffineApplyOp>(
-        loc, sym0.ceilDiv(elementsPerLane), extractOp.getPosition());
+        loc, sym0.ceilDiv(elementsPerLane),
+        newWarpOp->getResult(newRetIndices[1]));
     // Extract at position: pos % elementsPerLane
     Value pos =
         elementsPerLane == 1
             ? rewriter.create<arith::ConstantIndexOp>(loc, 0).getResult()
             : rewriter
-                  .create<affine::AffineApplyOp>(loc, sym0 % elementsPerLane,
-                                                 extractOp.getPosition())
+                  .create<affine::AffineApplyOp>(
+                      loc, sym0 % elementsPerLane,
+                      newWarpOp->getResult(newRetIndices[1]))
                   .getResult();
     Value extracted =
         rewriter.create<vector::ExtractElementOp>(loc, distributedVec, pos);
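
As a worked example of the two affine maps that consume the yielded position, the following sketch evaluates them on plain integers. It assumes a non-negative position and a positive element count per lane; `LaneAndOffset` and `splitPosition` are hypothetical names introduced only for illustration and are not part of `VectorDistribute.cpp`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mirrors the two affine maps applied to the position.
struct LaneAndOffset {
  int64_t lane;   // value of `s0 ceildiv elementsPerLane`
  int64_t offset; // value of `s0 mod elementsPerLane`
};

LaneAndOffset splitPosition(int64_t pos, int64_t elementsPerLane) {
  // Assumption of this sketch: only non-negative positions are considered.
  assert(pos >= 0 && elementsPerLane > 0);
  // Ceiling division for non-negative operands.
  int64_t lane = (pos + elementsPerLane - 1) / elementsPerLane;
  // The pattern emits a constant 0 when each lane holds a single element.
  int64_t offset = elementsPerLane == 1 ? 0 : pos % elementsPerLane;
  return {lane, offset};
}
```

For the failing test, the position is 9 and each lane holds 2 elements (`vector<64xf32>` distributed to `vector<2xf32>`), so the maps evaluate to 5 and 1. The code comment in the diff describes the first value as `pos / elementsPerLane`, while the map itself uses `ceildiv`; the sketch follows the map as printed in the error log.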

llvmbot · 2024-01-11T13:57:59Z

@llvm/pr-subscribers-mlir-vector

matthias-springer requested a review from ThomasRaoux January 11, 2024 13:57

matthias-springer requested review from hanhanW and nicolasvasilache as code owners January 11, 2024 13:57

llvmbot added mlir:vectorops mlir mlir:vector labels Jan 11, 2024

matthias-springer requested a review from jpienaar January 11, 2024 15:01

nicolasvasilache approved these changes Jan 12, 2024

matthias-springer merged commit ad100b3 into llvm:main Jan 12, 2024
