-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[mlir][vectorize] Support affine.apply in SuperVectorize #77968
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir-affine Author: Hsiangkai Wang (Hsiangkai) ChangesWe have no need to vectorize affine.apply inside the vectorizing loop. However, we still need to generate it in the original scalar form. We have to replace all its operands with the generated scalar operands in the vectorizing loop, e.g., induction variables. Full diff: https://github.com/llvm/llvm-project/pull/77968.diff 2 Files Affected:
diff --git a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
index 6b7a157925fae1..3b70618fe151c5 100644
--- a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
+++ b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -721,8 +721,7 @@ struct VectorizationState {
/// Example:
/// * 'replaced': induction variable of a loop to be vectorized.
/// * 'replacement': new induction variable in the new vector loop.
- void registerValueScalarReplacement(BlockArgument replaced,
- BlockArgument replacement);
+ void registerValueScalarReplacement(Value replaced, Value replacement);
/// Registers the scalar replacement of a scalar result returned from a
/// reduction loop. 'replacement' must be scalar.
@@ -854,8 +853,8 @@ void VectorizationState::registerValueVectorReplacementImpl(Value replaced,
/// Example:
/// * 'replaced': induction variable of a loop to be vectorized.
/// * 'replacement': new induction variable in the new vector loop.
-void VectorizationState::registerValueScalarReplacement(
- BlockArgument replaced, BlockArgument replacement) {
+void VectorizationState::registerValueScalarReplacement(Value replaced,
+ Value replacement) {
registerValueScalarReplacementImpl(replaced, replacement);
}
@@ -978,6 +977,28 @@ static arith::ConstantOp vectorizeConstant(arith::ConstantOp constOp,
return newConstOp;
}
+/// We have no need to vectorize affine.apply. However, we still need to
+/// generate it and replace the operands with values in valueScalarReplacement.
+static Operation *vectorizeAffineApplyOp(AffineApplyOp applyOp,
+ VectorizationState &state) {
+ SmallVector<Value, 8> updatedOperands;
+ for (Value operand : applyOp.getOperands()) {
+ Value updatedOperand = operand;
+ if (state.valueScalarReplacement.contains(operand)) {
+ updatedOperand = state.valueScalarReplacement.lookupOrDefault(operand);
+ }
+ updatedOperands.push_back(updatedOperand);
+ }
+
+ auto newApplyOp = state.builder.create<AffineApplyOp>(
+ applyOp.getLoc(), applyOp.getAffineMap(), updatedOperands);
+
+ // Register the new affine.apply result.
+ state.registerValueScalarReplacement(applyOp.getResult(),
+ newApplyOp.getResult());
+ return newApplyOp;
+}
+
/// Creates a constant vector filled with the neutral elements of the given
/// reduction. The scalar type of vector elements will be taken from
/// `oldOperand`.
@@ -1493,6 +1514,8 @@ static Operation *vectorizeOneOperation(Operation *op,
return vectorizeAffineYieldOp(yieldOp, state);
if (auto constant = dyn_cast<arith::ConstantOp>(op))
return vectorizeConstant(constant, state);
+ if (auto applyOp = dyn_cast<AffineApplyOp>(op))
+ return vectorizeAffineApplyOp(applyOp, state);
// Other ops with regions are not supported.
if (op->getNumRegions() != 0)
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
new file mode 100644
index 00000000000000..588663e1f97b61
--- /dev/null
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
@@ -0,0 +1,23 @@
+// RUN: mlir-opt %s -affine-super-vectorize="virtual-vector-size=8 test-fastest-varying=0" -split-input-file | FileCheck %s
+
+// CHECK-DAG: #[[$map_id0:map[0-9a-zA-Z_]*]] = affine_map<(d0) -> (d0 mod 12)>
+// CHECK-DAG: #[[$map_id1:map[0-9a-zA-Z_]*]] = affine_map<(d0) -> (d0 mod 16)>
+
+// CHECK-LABEL: func @vec_affine_apply
+func.func @vec_affine_apply(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24x48xf32>) {
+ affine.for %arg2 = 0 to 8 {
+// CHECK: affine.for %[[S0:.*]] = 0 to 24 {
+// CHECK-NEXT: affine.for %[[S1:.*]] = 0 to 48 step 8 {
+ affine.for %arg3 = 0 to 24 {
+ affine.for %arg4 = 0 to 48 {
+// CHECK-NEXT: affine.apply #[[$map_id0]](%[[S0]])
+// CHECK-NEXT: affine.apply #[[$map_id1]](%[[S1]])
+ %0 = affine.apply affine_map<(d0) -> (d0 mod 12)>(%arg3)
+ %1 = affine.apply affine_map<(d0) -> (d0 mod 16)>(%arg4)
+ %2 = affine.load %arg0[%arg2, %0, %1] : memref<8x12x16xf32>
+ affine.store %2, %arg1[%arg2, %arg3, %arg4] : memref<8x24x48xf32>
+ }
+ }
+ }
+ return
+}
|
@llvm/pr-subscribers-mlir Author: Hsiangkai Wang (Hsiangkai) ChangesWe have no need to vectorize affine.apply inside the vectorizing loop. However, we still need to generate it in the original scalar form. We have to replace all its operands with the generated scalar operands in the vectorizing loop, e.g., induction variables. Full diff: https://github.com/llvm/llvm-project/pull/77968.diff 2 Files Affected:
diff --git a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
index 6b7a157925fae1..3b70618fe151c5 100644
--- a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
+++ b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -721,8 +721,7 @@ struct VectorizationState {
/// Example:
/// * 'replaced': induction variable of a loop to be vectorized.
/// * 'replacement': new induction variable in the new vector loop.
- void registerValueScalarReplacement(BlockArgument replaced,
- BlockArgument replacement);
+ void registerValueScalarReplacement(Value replaced, Value replacement);
/// Registers the scalar replacement of a scalar result returned from a
/// reduction loop. 'replacement' must be scalar.
@@ -854,8 +853,8 @@ void VectorizationState::registerValueVectorReplacementImpl(Value replaced,
/// Example:
/// * 'replaced': induction variable of a loop to be vectorized.
/// * 'replacement': new induction variable in the new vector loop.
-void VectorizationState::registerValueScalarReplacement(
- BlockArgument replaced, BlockArgument replacement) {
+void VectorizationState::registerValueScalarReplacement(Value replaced,
+ Value replacement) {
registerValueScalarReplacementImpl(replaced, replacement);
}
@@ -978,6 +977,28 @@ static arith::ConstantOp vectorizeConstant(arith::ConstantOp constOp,
return newConstOp;
}
+/// We have no need to vectorize affine.apply. However, we still need to
+/// generate it and replace the operands with values in valueScalarReplacement.
+static Operation *vectorizeAffineApplyOp(AffineApplyOp applyOp,
+ VectorizationState &state) {
+ SmallVector<Value, 8> updatedOperands;
+ for (Value operand : applyOp.getOperands()) {
+ Value updatedOperand = operand;
+ if (state.valueScalarReplacement.contains(operand)) {
+ updatedOperand = state.valueScalarReplacement.lookupOrDefault(operand);
+ }
+ updatedOperands.push_back(updatedOperand);
+ }
+
+ auto newApplyOp = state.builder.create<AffineApplyOp>(
+ applyOp.getLoc(), applyOp.getAffineMap(), updatedOperands);
+
+ // Register the new affine.apply result.
+ state.registerValueScalarReplacement(applyOp.getResult(),
+ newApplyOp.getResult());
+ return newApplyOp;
+}
+
/// Creates a constant vector filled with the neutral elements of the given
/// reduction. The scalar type of vector elements will be taken from
/// `oldOperand`.
@@ -1493,6 +1514,8 @@ static Operation *vectorizeOneOperation(Operation *op,
return vectorizeAffineYieldOp(yieldOp, state);
if (auto constant = dyn_cast<arith::ConstantOp>(op))
return vectorizeConstant(constant, state);
+ if (auto applyOp = dyn_cast<AffineApplyOp>(op))
+ return vectorizeAffineApplyOp(applyOp, state);
// Other ops with regions are not supported.
if (op->getNumRegions() != 0)
diff --git a/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
new file mode 100644
index 00000000000000..588663e1f97b61
--- /dev/null
+++ b/mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir
@@ -0,0 +1,23 @@
+// RUN: mlir-opt %s -affine-super-vectorize="virtual-vector-size=8 test-fastest-varying=0" -split-input-file | FileCheck %s
+
+// CHECK-DAG: #[[$map_id0:map[0-9a-zA-Z_]*]] = affine_map<(d0) -> (d0 mod 12)>
+// CHECK-DAG: #[[$map_id1:map[0-9a-zA-Z_]*]] = affine_map<(d0) -> (d0 mod 16)>
+
+// CHECK-LABEL: func @vec_affine_apply
+func.func @vec_affine_apply(%arg0: memref<8x12x16xf32>, %arg1: memref<8x24x48xf32>) {
+ affine.for %arg2 = 0 to 8 {
+// CHECK: affine.for %[[S0:.*]] = 0 to 24 {
+// CHECK-NEXT: affine.for %[[S1:.*]] = 0 to 48 step 8 {
+ affine.for %arg3 = 0 to 24 {
+ affine.for %arg4 = 0 to 48 {
+// CHECK-NEXT: affine.apply #[[$map_id0]](%[[S0]])
+// CHECK-NEXT: affine.apply #[[$map_id1]](%[[S1]])
+ %0 = affine.apply affine_map<(d0) -> (d0 mod 12)>(%arg3)
+ %1 = affine.apply affine_map<(d0) -> (d0 mod 16)>(%arg4)
+ %2 = affine.load %arg0[%arg2, %0, %1] : memref<8x12x16xf32>
+ affine.store %2, %arg1[%arg2, %arg3, %arg4] : memref<8x24x48xf32>
+ }
+ }
+ }
+ return
+}
|
Ping. |
I barely remember this code but IIRC we were able to deal with affine.apply ops. @sergei-grechanik ? |
In the current implementation, the vectorizer will fail. The debug log is
|
999d525
to
7bdcc9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about handling affine.apply like this because it's recommended to fold affine.apply ops into load/store ops anyway.
7bdcc9f
to
24f52eb
Compare
Ping. |
I think the crash I've mentioned should be fixed, otherwise it would be a regression (I think it doesn't crash without the patch, just avoids vectorization). |
24f52eb
to
788a7f1
Compare
I fixed it and added two more test cases to verify it. Please help me to review this patch. Thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a minor nit, otherwise lgtm
We have no need to vectorize affine.apply inside the vectorizing loop. However, we still need to generate it in the original scalar form. We have to replace all its operands with the generated scalar operands in the vectorizing loop, e.g., induction variables.
788a7f1
to
f216992
Compare
We have no need to vectorize affine.apply inside the vectorizing loop. However, we still need to generate it in the original scalar form. We have to replace all its operands with the generated scalar operands in the vectorizing loop, e.g., induction variables.