[mlir][ArmSME] Add rewrites to swap extract of extend #80407

c-rhodes · 2024-02-02T08:54:40Z

In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence:

%0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
%1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
%lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32>

... (same for rhs)

%2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32>

// x4 chained by accumulator

This chain of 4 outer products can be fused into a single 4-way widening variant but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts.

This patch fixes this with two rewrites that swaps extract(extend) into extend(extract).

Related to #78975, #79288.

llvmbot · 2024-02-02T08:55:11Z

@llvm/pr-subscribers-mlir-sme

Author: Cullen Rhodes (c-rhodes)

Changes

In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence:

%0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
%1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
%lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32>

... (same for rhs)

%2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32>

// x4 chained by accumulator

This chain of 4 outer products can be fused into a single 4-way widening variant but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts.

This patch fixes this with two rewrites that swaps extract(extend) into extend(extract).

Related to #78975, #79288.

Full diff: https://github.com/llvm/llvm-project/pull/80407.diff

2 Files Affected:

(modified) mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp (+101)
(modified) mlir/test/Dialect/ArmSME/vector-legalization.mlir (+40)

diff --git a/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp b/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
index 85ec53c2618aa..40f5093c1404a 100644
--- a/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
+++ b/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
@@ -338,6 +338,105 @@ struct LegalizeTransferWriteOpsByDecomposition
   }
 };
 
+// Shuffles arith extend ops after vector.extract op.
+//
+// This transforms IR like:
+//   %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+//   %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
+// Into:
+//   %0 = vector.extract %src[0] : vector<[8]xi8> from vector<4x[8]xi8>
+//   %1 = arith.extsi %0 : vector<[8]xi8> to vector<[8]xi32>
+//
+// This enables outer product fusion in the `-arm-sme-outer-product-fusion`
+// pass when the result is the input to an outer product.
+struct SwapVectorExtractOfArithExtend
+    : public OpRewritePattern<vector::ExtractOp> {
+  using OpRewritePattern::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(vector::ExtractOp extractOp,
+                                PatternRewriter &rewriter) const override {
+    VectorType resultType = llvm::dyn_cast<VectorType>(extractOp.getType());
+    if (!resultType)
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extracted type is not a vector type");
+
+    auto numScalableDims = llvm::count(resultType.getScalableDims(), true);
+    if (numScalableDims != 1)
+      return rewriter.notifyMatchFailure(
+          extractOp, "extracted type is not a 1-D scalable vector type");
+
+    auto *extendOp = extractOp.getVector().getDefiningOp();
+    if (!isa_and_present<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp>(
+            extendOp))
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extract not from extend op");
+
+    auto loc = extractOp.getLoc();
+    StringAttr extendOpName = extendOp->getName().getIdentifier();
+    Value extendSource = extendOp->getOperand(0);
+
+    // Create new extract from source of extend.
+    Value newExtract = rewriter.create<vector::ExtractOp>(
+        loc, extendSource, extractOp.getMixedPosition());
+
+    // Extend new extract to original result type.
+    Operation *newExtend =
+        rewriter.create(loc, extendOpName, Value(newExtract), resultType);
+
+    rewriter.replaceOp(extractOp, newExtend->getResult(0));
+
+    return success();
+  }
+};
+
+// Shuffles arith extend ops after vector.scalable.extract op.
+//
+// This transforms IR like:
+//   %0 = arith.extsi %src : vector<[8]xi8> to vector<[8]xi32>
+//   %1 = vector.scalable.extract %0[0] : vector<[4]xi32> from vector<[8]xi32>
+// Into:
+//   %0 = vector.scalable.extract %src[0] : vector<[4]xi8> from vector<[8]xi8>
+//   %1 = arith.extsi %0 : vector<[4]xi8> to vector<[4]xi32>
+//
+// This enables outer product fusion in the `-arm-sme-outer-product-fusion`
+// pass when the result is the input to an outer product.
+struct SwapVectorScalableExtractOfArithExtend
+    : public OpRewritePattern<vector::ScalableExtractOp> {
+  using OpRewritePattern::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(vector::ScalableExtractOp extractOp,
+                                PatternRewriter &rewriter) const override {
+    auto *extendOp = extractOp.getSource().getDefiningOp();
+    if (!isa_and_present<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp>(
+            extendOp))
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extract not from extend op");
+
+    auto loc = extractOp.getLoc();
+    VectorType resultType = extractOp.getResultVectorType();
+
+    Value extendSource = extendOp->getOperand(0);
+    StringAttr extendOpName = extendOp->getName().getIdentifier();
+    VectorType extendSourceVectorType =
+        cast<VectorType>(extendSource.getType());
+
+    // Create new extract from source of extend.
+    VectorType extractResultVectorType =
+        VectorType::Builder(resultType)
+            .setElementType(extendSourceVectorType.getElementType());
+    Value newExtract = rewriter.create<vector::ScalableExtractOp>(
+        loc, extractResultVectorType, extendSource, extractOp.getPos());
+
+    // Extend new extract to original result type.
+    Operation *newExtend =
+        rewriter.create(loc, extendOpName, Value(newExtract), resultType);
+
+    rewriter.replaceOp(extractOp, newExtend->getResult(0));
+
+    return success();
+  }
+};
+
 struct VectorLegalizationPass
     : public arm_sme::impl::VectorLegalizationBase<VectorLegalizationPass> {
   void runOnOperation() override {
@@ -358,6 +457,8 @@ struct VectorLegalizationPass
           return success();
         });
 
+    patterns.add<SwapVectorExtractOfArithExtend,
+                 SwapVectorScalableExtractOfArithExtend>(context);
     // Note: High benefit to ensure masked outer products are lowered first.
     patterns.add<LegalizeMaskedVectorOuterProductOpsByDecomposition>(
         converter, context, 1024);
diff --git a/mlir/test/Dialect/ArmSME/vector-legalization.mlir b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
index a20abeefedcfd..f9bc246273da6 100644
--- a/mlir/test/Dialect/ArmSME/vector-legalization.mlir
+++ b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
@@ -266,3 +266,43 @@ func.func @transpose_f32_scalable_4x16_via_write(%src: memref<?x?xf32>, %dest: m
   vector.transfer_write %0, %dest[%c0, %c0] {permutation_map = #transpose, in_bounds = [true, true]} : vector<[4]x[16]xf32>, memref<?x?xf32>
   return
 }
+
+// -----
+
+// CHECK-LABEL: @extract_from_arith_ext(
+// CHECK-SAME:                          %[[SRC:.*]]: vector<4x[8]xi8>
+// CHECK: %[[EXTRACT:.*]] = vector.extract %[[SRC]][0] : vector<[8]xi8> from vector<4x[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[8]xi8> to vector<[8]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @extract_from_arith_ext(%src: vector<4x[8]xi8>) -> vector<[8]xi32> {
+  %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+  %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
+  return %1 : vector<[8]xi32>
+}
+
+// -----
+
+// CHECK-LABEL: @non_constant_extract_from_arith_ext(
+// CHECK-SAME:                                       %[[SRC:[a-z0-9]+]]: vector<4x[8]xi8>,
+// CHECK-SAME:                                       %[[DIM:[a-z0-9]+]]: index
+// CHECK: %[[EXTRACT:.*]] = vector.extract %[[SRC]][%[[DIM]]] : vector<[8]xi8> from vector<4x[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[8]xi8> to vector<[8]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @non_constant_extract_from_arith_ext(%src: vector<4x[8]xi8>, %dim: index) -> vector<[8]xi32> {
+  %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+  %1 = vector.extract %0[%dim] : vector<[8]xi32> from vector<4x[8]xi32>
+  return %1 : vector<[8]xi32>
+}
+
+// -----
+
+// CHECK-LABEL: @scalable_extract_from_arith_ext(
+// CHECK-SAME:                                   %[[SRC:.*]]: vector<[8]xi8>
+// CHECK: %[[EXTRACT:.*]] = vector.scalable.extract %[[SRC]][0] : vector<[4]xi8> from vector<[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[4]xi8> to vector<[4]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @scalable_extract_from_arith_ext(%src: vector<[8]xi8>) -> vector<[4]xi32> {
+  %0 = arith.extsi %src : vector<[8]xi8> to vector<[8]xi32>
+  %1 = vector.scalable.extract %0[0] : vector<[4]xi32> from vector<[8]xi32>
+  return %1 : vector<[4]xi32>
+}

llvmbot · 2024-02-02T08:55:12Z

@llvm/pr-subscribers-mlir

Author: Cullen Rhodes (c-rhodes)

Changes

In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence:

%0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
%1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
%lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32>

... (same for rhs)

%2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32>

// x4 chained by accumulator

This chain of 4 outer products can be fused into a single 4-way widening variant but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts.

This patch fixes this with two rewrites that swaps extract(extend) into extend(extract).

Related to #78975, #79288.

Full diff: https://github.com/llvm/llvm-project/pull/80407.diff

2 Files Affected:

(modified) mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp (+101)
(modified) mlir/test/Dialect/ArmSME/vector-legalization.mlir (+40)

diff --git a/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp b/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
index 85ec53c2618aa..40f5093c1404a 100644
--- a/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
+++ b/mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp
@@ -338,6 +338,105 @@ struct LegalizeTransferWriteOpsByDecomposition
   }
 };
 
+// Shuffles arith extend ops after vector.extract op.
+//
+// This transforms IR like:
+//   %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+//   %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
+// Into:
+//   %0 = vector.extract %src[0] : vector<[8]xi8> from vector<4x[8]xi8>
+//   %1 = arith.extsi %0 : vector<[8]xi8> to vector<[8]xi32>
+//
+// This enables outer product fusion in the `-arm-sme-outer-product-fusion`
+// pass when the result is the input to an outer product.
+struct SwapVectorExtractOfArithExtend
+    : public OpRewritePattern<vector::ExtractOp> {
+  using OpRewritePattern::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(vector::ExtractOp extractOp,
+                                PatternRewriter &rewriter) const override {
+    VectorType resultType = llvm::dyn_cast<VectorType>(extractOp.getType());
+    if (!resultType)
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extracted type is not a vector type");
+
+    auto numScalableDims = llvm::count(resultType.getScalableDims(), true);
+    if (numScalableDims != 1)
+      return rewriter.notifyMatchFailure(
+          extractOp, "extracted type is not a 1-D scalable vector type");
+
+    auto *extendOp = extractOp.getVector().getDefiningOp();
+    if (!isa_and_present<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp>(
+            extendOp))
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extract not from extend op");
+
+    auto loc = extractOp.getLoc();
+    StringAttr extendOpName = extendOp->getName().getIdentifier();
+    Value extendSource = extendOp->getOperand(0);
+
+    // Create new extract from source of extend.
+    Value newExtract = rewriter.create<vector::ExtractOp>(
+        loc, extendSource, extractOp.getMixedPosition());
+
+    // Extend new extract to original result type.
+    Operation *newExtend =
+        rewriter.create(loc, extendOpName, Value(newExtract), resultType);
+
+    rewriter.replaceOp(extractOp, newExtend->getResult(0));
+
+    return success();
+  }
+};
+
+// Shuffles arith extend ops after vector.scalable.extract op.
+//
+// This transforms IR like:
+//   %0 = arith.extsi %src : vector<[8]xi8> to vector<[8]xi32>
+//   %1 = vector.scalable.extract %0[0] : vector<[4]xi32> from vector<[8]xi32>
+// Into:
+//   %0 = vector.scalable.extract %src[0] : vector<[4]xi8> from vector<[8]xi8>
+//   %1 = arith.extsi %0 : vector<[4]xi8> to vector<[4]xi32>
+//
+// This enables outer product fusion in the `-arm-sme-outer-product-fusion`
+// pass when the result is the input to an outer product.
+struct SwapVectorScalableExtractOfArithExtend
+    : public OpRewritePattern<vector::ScalableExtractOp> {
+  using OpRewritePattern::OpRewritePattern;
+
+  LogicalResult matchAndRewrite(vector::ScalableExtractOp extractOp,
+                                PatternRewriter &rewriter) const override {
+    auto *extendOp = extractOp.getSource().getDefiningOp();
+    if (!isa_and_present<arith::ExtSIOp, arith::ExtUIOp, arith::ExtFOp>(
+            extendOp))
+      return rewriter.notifyMatchFailure(extractOp,
+                                         "extract not from extend op");
+
+    auto loc = extractOp.getLoc();
+    VectorType resultType = extractOp.getResultVectorType();
+
+    Value extendSource = extendOp->getOperand(0);
+    StringAttr extendOpName = extendOp->getName().getIdentifier();
+    VectorType extendSourceVectorType =
+        cast<VectorType>(extendSource.getType());
+
+    // Create new extract from source of extend.
+    VectorType extractResultVectorType =
+        VectorType::Builder(resultType)
+            .setElementType(extendSourceVectorType.getElementType());
+    Value newExtract = rewriter.create<vector::ScalableExtractOp>(
+        loc, extractResultVectorType, extendSource, extractOp.getPos());
+
+    // Extend new extract to original result type.
+    Operation *newExtend =
+        rewriter.create(loc, extendOpName, Value(newExtract), resultType);
+
+    rewriter.replaceOp(extractOp, newExtend->getResult(0));
+
+    return success();
+  }
+};
+
 struct VectorLegalizationPass
     : public arm_sme::impl::VectorLegalizationBase<VectorLegalizationPass> {
   void runOnOperation() override {
@@ -358,6 +457,8 @@ struct VectorLegalizationPass
           return success();
         });
 
+    patterns.add<SwapVectorExtractOfArithExtend,
+                 SwapVectorScalableExtractOfArithExtend>(context);
     // Note: High benefit to ensure masked outer products are lowered first.
     patterns.add<LegalizeMaskedVectorOuterProductOpsByDecomposition>(
         converter, context, 1024);
diff --git a/mlir/test/Dialect/ArmSME/vector-legalization.mlir b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
index a20abeefedcfd..f9bc246273da6 100644
--- a/mlir/test/Dialect/ArmSME/vector-legalization.mlir
+++ b/mlir/test/Dialect/ArmSME/vector-legalization.mlir
@@ -266,3 +266,43 @@ func.func @transpose_f32_scalable_4x16_via_write(%src: memref<?x?xf32>, %dest: m
   vector.transfer_write %0, %dest[%c0, %c0] {permutation_map = #transpose, in_bounds = [true, true]} : vector<[4]x[16]xf32>, memref<?x?xf32>
   return
 }
+
+// -----
+
+// CHECK-LABEL: @extract_from_arith_ext(
+// CHECK-SAME:                          %[[SRC:.*]]: vector<4x[8]xi8>
+// CHECK: %[[EXTRACT:.*]] = vector.extract %[[SRC]][0] : vector<[8]xi8> from vector<4x[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[8]xi8> to vector<[8]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @extract_from_arith_ext(%src: vector<4x[8]xi8>) -> vector<[8]xi32> {
+  %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+  %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32>
+  return %1 : vector<[8]xi32>
+}
+
+// -----
+
+// CHECK-LABEL: @non_constant_extract_from_arith_ext(
+// CHECK-SAME:                                       %[[SRC:[a-z0-9]+]]: vector<4x[8]xi8>,
+// CHECK-SAME:                                       %[[DIM:[a-z0-9]+]]: index
+// CHECK: %[[EXTRACT:.*]] = vector.extract %[[SRC]][%[[DIM]]] : vector<[8]xi8> from vector<4x[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[8]xi8> to vector<[8]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @non_constant_extract_from_arith_ext(%src: vector<4x[8]xi8>, %dim: index) -> vector<[8]xi32> {
+  %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32>
+  %1 = vector.extract %0[%dim] : vector<[8]xi32> from vector<4x[8]xi32>
+  return %1 : vector<[8]xi32>
+}
+
+// -----
+
+// CHECK-LABEL: @scalable_extract_from_arith_ext(
+// CHECK-SAME:                                   %[[SRC:.*]]: vector<[8]xi8>
+// CHECK: %[[EXTRACT:.*]] = vector.scalable.extract %[[SRC]][0] : vector<[4]xi8> from vector<[8]xi8>
+// CHECK: %[[EXTEND:.*]] = arith.extsi %[[EXTRACT]] : vector<[4]xi8> to vector<[4]xi32>
+// CHECK: return %[[EXTEND]]
+func.func @scalable_extract_from_arith_ext(%src: vector<[8]xi8>) -> vector<[4]xi32> {
+  %0 = arith.extsi %src : vector<[8]xi8> to vector<[8]xi32>
+  %1 = vector.scalable.extract %0[0] : vector<[4]xi32> from vector<[8]xi32>
+  return %1 : vector<[4]xi32>
+}

banach-space

Makes sense to me, thanks! Just a few small suggestions.

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp

mlir/test/Dialect/ArmSME/vector-legalization.mlir

In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence: %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32> %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32> %lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32> ... (same for rhs) %2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32> // x4 chained by accumulator This chain of 4 outer products can be fused into a single 4-way widening variant but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts. This patch fixes this with two rewrites that swaps extract(extend) into extend(extract). Related to llvm#78975, llvm#79288.

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp

Add negative tests.

MacDue

The rewrites look good, though I wonder if it'd make more sense to include these in the -arm-sme-outer-product-fusion pass instead? They'd work just as well there, and that would also ensure no pass between -arm-sme-vector-legalization and -arm-sme-outer-product-fusion moves them again (due to some random canonicalization).

c-rhodes · 2024-02-05T12:38:38Z

The rewrites look good, though I wonder if it'd make more sense to include these in the -arm-sme-outer-product-fusion pass instead? They'd work just as well there, and that would also ensure no pass between -arm-sme-vector-legalization and -arm-sme-outer-product-fusion moves them again (due to some random canonicalization).

Good point, I agree, will move them.

c-rhodes · 2024-02-05T13:33:13Z

The rewrites look good, though I wonder if it'd make more sense to include these in the -arm-sme-outer-product-fusion pass instead? They'd work just as well there, and that would also ensure no pass between -arm-sme-vector-legalization and -arm-sme-outer-product-fusion moves them again (due to some random canonicalization).

Good point, I agree, will move them.

Done 👍

In mixed matmul lowering (e.g., i8 to i32) we're seeing the following sequence: %0 = arith.extsi %src : vector<4x[8]xi8> to vector<4x[8]xi32> %1 = vector.extract %0[0] : vector<[8]xi32> from vector<4x[8]xi32> %lhs = vector.scalable.extract %1[0] : vector<[4]xi32> from vector<[8]xi32> ... (same for rhs) %2 = vector.outerproduct %lhs, %rhs, %acc vector<[4]xi32>, vector<[4]xi32> // x4 chained by accumulator This chain of 4 outer products can be fused into a single 4-way widening variant but the pass doesn't match on the IR, as it expects the source of the inputs to be an extend and it can't look through the extracts. This patch fixes this with two rewrites that swaps extract(extend) into extend(extract). Related to llvm#78975, llvm#79288.

c-rhodes requested a review from MacDue February 2, 2024 08:54

c-rhodes requested review from banach-space, dcaballe and nicolasvasilache as code owners February 2, 2024 08:54

llvmbot added mlir mlir:sme labels Feb 2, 2024

banach-space approved these changes Feb 5, 2024

View reviewed changes

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp Outdated Show resolved Hide resolved

mlir/test/Dialect/ArmSME/vector-legalization.mlir Outdated Show resolved Hide resolved

MacDue reviewed Feb 5, 2024

View reviewed changes

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp Outdated Show resolved Hide resolved

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp Outdated Show resolved Hide resolved

mlir/lib/Dialect/ArmSME/Transforms/VectorLegalization.cpp Outdated Show resolved Hide resolved

Address comments

e26e836

c-rhodes force-pushed the mlir-reorder-vector-extract-arith-ext branch from c27b628 to e26e836 Compare February 5, 2024 12:05

Address comments.

c75c20a

Add negative tests.

MacDue approved these changes Feb 5, 2024

View reviewed changes

Move patterns to -arm-sme-outer-product-fusion

ada65df

c-rhodes merged commit 5f5b3bb into llvm:main Feb 5, 2024

c-rhodes deleted the mlir-reorder-vector-extract-arith-ext branch February 5, 2024 14:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][ArmSME] Add rewrites to swap extract of extend #80407

[mlir][ArmSME] Add rewrites to swap extract of extend #80407

Uh oh!

c-rhodes commented Feb 2, 2024

Uh oh!

llvmbot commented Feb 2, 2024

Uh oh!

llvmbot commented Feb 2, 2024

Uh oh!

banach-space left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MacDue left a comment

Uh oh!

c-rhodes commented Feb 5, 2024

Uh oh!

c-rhodes commented Feb 5, 2024

Uh oh!

Uh oh!

[mlir][ArmSME] Add rewrites to swap extract of extend #80407

[mlir][ArmSME] Add rewrites to swap extract of extend #80407

Uh oh!

Conversation

c-rhodes commented Feb 2, 2024

Uh oh!

llvmbot commented Feb 2, 2024

Uh oh!

llvmbot commented Feb 2, 2024

Uh oh!

banach-space left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

MacDue left a comment

Choose a reason for hiding this comment

Uh oh!

c-rhodes commented Feb 5, 2024

Uh oh!

c-rhodes commented Feb 5, 2024

Uh oh!

Uh oh!