[mlir][vector] Determine vector sizes from the result shape for tensor.pad operation #97248
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-mlir-linalg @llvm/pr-subscribers-mlir

Author: xiaohui1.xu (BRUCE11111)

Changes
Full diff: https://github.com/llvm/llvm-project/pull/97248.diff 3 Files Affected:
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 3a75d2ac08157..a826b153e979f 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1708,6 +1708,13 @@ vectorizeAsTensorPadOp(RewriterBase &rewriter, tensor::PadOp padOp,
.reifyResultShapes(rewriter, reifiedReturnShapes);
(void)status; // prevent unused variable warning on non-assert builds
assert(succeeded(status) && "failed to reify result shapes");
+
+ // If the input vector sizes are not provided, then the vector sizes are
+ // determined by the result tensor shape.
+ if (inputVectorSizes.empty()) {
+ inputVectorSizes =
+ mlir::dyn_cast<RankedTensorType>(padOp->getResultTypes()[0]).getShape();
+ }
auto maskedRead = vector::createReadOrMaskedRead(
rewriter, loc, padOp.getSource(), inputVectorSizes, padValue,
/*useInBoundsInsteadOfMasking=*/false);
@@ -1915,9 +1922,16 @@ vectorizePadOpPrecondition(tensor::PadOp padOp,
return failure();
}
+ bool satisfyEmptyCond = true;
+ if (inputVectorSizes.empty()) {
+ if (!mlir::dyn_cast<RankedTensorType>(padOp->getResultTypes()[0])
+ .hasStaticShape() ||
+ !padOp.getSourceType().hasStaticShape())
+ satisfyEmptyCond = false;
+ }
ArrayRef<int64_t> resultTensorShape = padOp.getResultType().getShape();
- if (failed(vector::isValidMaskedInputVector(resultTensorShape,
- inputVectorSizes)))
+ if (!satisfyEmptyCond && failed(vector::isValidMaskedInputVector(
+ resultTensorShape, inputVectorSizes)))
return failure();
if (llvm::any_of(padOp.getLow(), [](Value v) {
diff --git a/mlir/test/Dialect/Linalg/vectorization-unsupported.mlir b/mlir/test/Dialect/Linalg/vectorization-unsupported.mlir
index 5d3c07c8e23c1..ea7b81871ede3 100644
--- a/mlir/test/Dialect/Linalg/vectorization-unsupported.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization-unsupported.mlir
@@ -126,3 +126,23 @@ module attributes {transform.with_named_sequence} {
transform.yield
}
}
+
+ // -----
+
+func.func @test_pad_no_vectorize_dynamic_shape(%arg0: tensor<?x?xf32>, %arg1: tensor<4x16xf32>) -> tensor<4x16xf32> {
+ %f0 = arith.constant 0.0 : f32
+ // expected-error @+1 {{Attempted to vectorize, but failed}}
+ %pad = tensor.pad %arg0 low[0, 0] high[1,1] {
+ ^bb0(%arg3: index, %arg4: index):
+ tensor.yield %f0 : f32
+ } : tensor<?x?xf32> to tensor<4x16xf32>
+ return %pad : tensor<4x16xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
+ %0 = transform.structured.match ops{["tensor.pad"]} in %arg0 : (!transform.any_op) -> !transform.any_op
+ transform.structured.vectorize %0 : !transform.any_op
+ transform.yield
+ }
+}
diff --git a/mlir/test/Dialect/Linalg/vectorization.mlir b/mlir/test/Dialect/Linalg/vectorization.mlir
index bbeccc7fecd68..e5e70ca40f300 100644
--- a/mlir/test/Dialect/Linalg/vectorization.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization.mlir
@@ -1055,3 +1055,35 @@ func.func @test_vectorize_unpack_no_vector_sizes_permute(%source: tensor<4x7x4xf
transform.yield
}
}
+
+ // -----
+
+// CHECK-LABEL: test_vectorize_pad_no_vector_sizes
+func.func @test_vectorize_pad_no_vector_sizes(%arg0: tensor<63x63xf32>) -> tensor<64x64xf32> {
+ %f0 = arith.constant 0.0 : f32
+ %pad = tensor.pad %arg0 low[0, 0] high[1, 1] {
+ ^bb0(%arg1: index, %arg2: index):
+ tensor.yield %f0 : f32
+ } : tensor<63x63xf32> to tensor<64x64xf32>
+ return %pad : tensor<64x64xf32>
+}
+// CHECK-DAG: %[[cst:.*]] = arith.constant 0.000000e+00 : f32
+// CHECK-DAG: %[[c0:.*]] = arith.constant 0 : index
+// CHECK-DAG: %[[c63:.*]] = arith.constant 63 : index
+// CHECK-DAG: %[[c63_0:.*]] = arith.constant 63 : index
+// CHECK: %[[mask:.*]] = vector.create_mask %[[c63]], %[[c63_0]] : vector<64x64xi1>
+// CHECK: %[[read:.*]] = vector.mask %0 { vector.transfer_read {{.*}}, %cst {in_bounds = [true, true]}
+// CHECK-SAME : tensor<63x63xf32>, vector<64x64xf32> } : vector<64x64xi1> -> vector<64x64xf32>
+// CHECK: %[[empty:.*]] = tensor.empty() : tensor<64x64xf32>
+// CHECK: %[[c0_1:.*]] = arith.constant 0 : index
+// CHECK: %[[result:.*]] = vector.transfer_write %[[read]], {{.*}} {in_bounds = [true, true]} :
+// CHECK-SAME : vector<64x64xf32>, tensor<64x64xf32>
+// CHECK: return %[[result:.*]] : tensor<64x64xf32>
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg0: !transform.any_op {transform.readonly}) {
+ %0 = transform.structured.match ops{["tensor.pad"]} in %arg0 : (!transform.any_op) -> !transform.any_op
+ transform.structured.vectorize %0 : !transform.any_op
+ transform.yield
+ }
+}
// CHECK: %[[mask:.*]] = vector.create_mask %[[c63]], %[[c63_0]] : vector<64x64xi1>
// CHECK: %[[read:.*]] = vector.mask %0 { vector.transfer_read {{.*}}, %cst {in_bounds = [true, true]}
The approach here should use `in_bounds` set to false when input vector sizes are not provided; masking should only be used when input vector sizes are provided. Eventually we may get rid of `in_bounds`, at which point we would have to emulate its behavior with a masked operation.
Hi~ Thanks for the advice~. In order to get a better implementation, I have a question here.
First, I don't see any essential difference between inferring the vector sizes from the result shape and setting the vector sizes directly: both ultimately provide the vector sizes.
Second, I tried to figure out why we prefer `in_bounds` over `mask`, and I may have found some reasons. For `transfer_read` or `transfer_write` operations, when we lower them to LLVM (code link), we find that mask operations are not supported perfectly yet. Sometimes we need a lowering pattern that can lower a masked read to a plain read.
But when the input vector sizes are provided, why can we use mask operations?
A more precise question: under what conditions can we safely use `vector.mask` on an operation when writing code?
Thanks~
> First I don't see any essential difference between inferring the vectorize size from the result shape and setting the vectorize size directly. They both provide vector size finally.

How do you infer sizes when the input shapes are dynamic?

> I try to figure out why we prefer to use `in_bounds` other than `mask`, I may find some reasons here.

There's a lot of discussion about the semantics of `in_bounds` ATM. I am trying to clarify some of it here:

> But when the input vector sizes are provided why we can use mask operation?

I do believe that masks and `in_bounds` model slightly different things, but the distinction is not well documented ATM.
This is lacking motivation. Why not use `GenericPadOpVectorizationPattern` and lower to `linalg.fill` instead? That's already available upstream.
This PR is about the method discussed here and here. This PR is just a fix for that; no other motivation.
Maybe it should be: in the empty input sizes case, the shapes must be static: https://github.com/BRUCE11111/llvm-project/blob/0653099322a7e96a693c6739cf9b640a4c1fca19/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp#L1926C3-L1931C4
Great work. I will read it carefully.
I think so. But I still can't understand why, in the current method, the `tensor.pad` operation also needs to support an empty `inputVectorSizes` to do the vectorization.