[mlir][vector] Make TransposeOpLowering configurable #73915
Conversation
Following the discussion here:

* llvm#72105

this patch makes the `TransposeOpLowering` configurable so that one can select whether to favour `vector.shape_cast` over `vector.transpose`.

As per the discussion in llvm#72105, using `vector.shape_cast` is very beneficial and desirable when targeting `LLVM IR` (CPU lowering), but simply won't work when targeting `SPIR-V` (GPU lowering). So we need a mechanism to be able to disable/enable the pattern introduced in llvm#72105. This patch proposes one such mechanism.

While this should solve the problem that we are facing today, we may need to introduce something more elaborate to specialise for CPU vs GPU lowering. Also, (once implemented) this proposal might make this workaround redundant:

* https://discourse.llvm.org/t/improving-handling-of-unit-dimensions-in-the-vector-dialect/
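For readers wondering how the new knob is meant to be consumed downstream, here is a minimal sketch (not code from this patch). It assumes the `populateVectorTransposeLoweringPatterns` helper from `mlir/Dialect/Vector/Transforms/LoweringPatterns.h` and uses the raw flag rather than a setter:

#include "mlir/Dialect/Vector/Transforms/LoweringPatterns.h"
#include "mlir/Dialect/Vector/Transforms/VectorTransforms.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Configure the vector.transpose lowering patterns for a pipeline that
// ultimately targets SPIR-V, where vector.shape_cast cannot be lowered yet.
static void addTransposeLoweringPatterns(RewritePatternSet &patterns) {
  vector::VectorTransformsOptions options;
  // Opt out of the vector.transpose -> vector.shape_cast rewrite added in
  // llvm#72105; the flag defaults to true (enabled).
  options.useShapeCast = false;
  vector::populateVectorTransposeLoweringPatterns(patterns, options);
}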
@llvm/pr-subscribers-mlir-vector @llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)

Changes

Following the discussion in llvm#72105, this patch makes the `TransposeOpLowering` configurable so that one can select whether to favour `vector.shape_cast` over `vector.transpose`. As per the discussion in #72105, using `vector.shape_cast` is very beneficial and desirable when targeting `LLVM IR` (CPU lowering), but simply won't work when targeting `SPIR-V` (GPU lowering), so we need a mechanism to disable/enable the pattern introduced in #72105. While this should solve the problem that we are facing today, we may need to introduce something more elaborate to specialise for CPU vs GPU lowering.

Full diff: https://github.com/llvm/llvm-project/pull/73915.diff

2 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h b/mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
index 08d3bb157a0e396..8f300d66c9a18f2 100644
--- a/mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
+++ b/mlir/include/mlir/Dialect/Vector/Transforms/VectorTransforms.h
@@ -59,6 +59,8 @@ struct VectorTransformsOptions {
vectorTransferSplit = opt;
return *this;
}
+
+ bool useShapeCast = true;
};
//===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTranspose.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTranspose.cpp
index 97f6caca1b25ccc..4d43a76c4a4efcc 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTranspose.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTranspose.cpp
@@ -334,22 +334,24 @@ class TransposeOpLowering : public OpRewritePattern<vector::TransposeOp> {
return rewriter.notifyMatchFailure(
op, "Options specifies lowering to shuffle");
- // Replace:
- // vector.transpose %0, [1, 0] : vector<nx1x<eltty>> to
- // vector<1xnxelty>
- // with:
- // vector.shape_cast %0 : vector<nx1x<eltty>> to vector<1xnxelty>
- //
- // Source with leading unit dim (inverse) is also replaced. Unit dim must
- // be fixed. Non-unit can be scalable.
- if (resType.getRank() == 2 &&
- ((resType.getShape().front() == 1 &&
- !resType.getScalableDims().front()) ||
- (resType.getShape().back() == 1 &&
- !resType.getScalableDims().back())) &&
- transp == ArrayRef<int64_t>({1, 0})) {
- rewriter.replaceOpWithNewOp<vector::ShapeCastOp>(op, resType, input);
- return success();
+ if (vectorTransformOptions.useShapeCast) {
+ // Replace:
+ // vector.transpose %0, [1, 0] : vector<nx1x<eltty>> to
+ // vector<1xnxelty>
+ // with:
+ // vector.shape_cast %0 : vector<nx1x<eltty>> to vector<1xnxelty>
+ //
+ // Source with leading unit dim (inverse) is also replaced. Unit dim must
+ // be fixed. Non-unit can be scalable.
+ if (resType.getRank() == 2 &&
+ ((resType.getShape().front() == 1 &&
+ !resType.getScalableDims().front()) ||
+ (resType.getShape().back() == 1 &&
+ !resType.getScalableDims().back())) &&
+ transp == ArrayRef<int64_t>({1, 0})) {
+ rewriter.replaceOpWithNewOp<vector::ShapeCastOp>(op, resType, input);
+ return success();
+ }
}
if (inputType.isScalable())
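The guarded condition can be hard to parse inside the diff. Distilled into a standalone predicate (a sketch for illustration, not code from the patch), it says: the rewrite only fires for rank-2 `[1, 0]` transposes whose unit dimension is fixed-size, while the non-unit dimension may be scalable.

#include "mlir/IR/BuiltinTypes.h"

using namespace mlir;

// Returns true iff the transpose qualifies for the shape_cast rewrite, e.g.
// vector<4x1xf32> -> vector<1x4xf32>, or the scalable case
// vector<[4]x1xf32> -> vector<1x[4]xf32>.
static bool qualifiesForShapeCast(VectorType resType,
                                  ArrayRef<int64_t> transp) {
  if (resType.getRank() != 2 || transp != ArrayRef<int64_t>({1, 0}))
    return false;
  // The unit dim must be fixed-size; the other dim may be scalable.
  bool leadingFixedUnit = resType.getShape().front() == 1 &&
                          !resType.getScalableDims().front();
  bool trailingFixedUnit = resType.getShape().back() == 1 &&
                           !resType.getScalableDims().back();
  return leadingFixedUnit || trailingFixedUnit;
}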
As #72105 has been reverted in IREE's fork of LLVM (link), we are quite blocked in terms of supporting e2e lowering to SME in IREE (we are actually super close). So I am very keen to revert that ... revert :)

@MaheshRavishankar and I chatted about this offline and, IIUC, this could be one solution that would work for everyone for the time being. And, IIUC, others were also OK with making this pattern "conditional". But I might have missed something obvious or misread your comments. Please let me know :)

This patch would still require a follow-up in IREE to flip this condition to "false", but that's tangential to this change.

Thank you for taking a look,
Have the IREE folks looked into recognizing that a `vector.shape_cast` would be a transpose? (Or undoing the transformation?)
We could also implement this lowering using `vector.extract`/`vector.insert`.
No, there's no way to represent this with vector.extract/insert; due to scalability it'd need a loop. Like:

(Also, that still keeps around the invalid `vector.transpose`.) For the issue we're trying to solve, though, it looks like we could add a fold, which could fold to:

This avoids adding a `vector.shape_cast`.
Ouch! Not sure why I expected
I think the source of the problem is that SPIR-V can't currently handle `vector.shape_cast`.
Yes, but the suggested fold is not adding a new `vector.shape_cast`. Implemented here: #73951
Just to clarify, this has nothing to do with IREE. This is an issue hit in IREE because `vector.shape_cast` cannot be handled on conversion to SPIR-V in MLIR.
Could you clarify why this is an issue? I thought one of the problems with vector lowering was not having control over how things are lowered, and this seems to be along those lines. I think it is impossible to have all backends handle all vector lowerings. I understand we don't want to diverge too much, but some amount of control already exists over how vector lowering works based on the user's situation (outer product vs dot product, etc.), so this is adding to that. If we want this to be supported on all backends, then it needs to be verified that any vector lowering can lower on all backends before it is added on the general path?
Sorry, perhaps my comment was misleading. My concern is more about starting to avoid certain vector operations when targeting one backend, making that backend compatible with only a subset of the vector dialect. It's not about making all the lowerings work for all the backends; it's about having at least one lowering that can lower each vector op to each backend.
Yeah, makes sense... but this is a missing piece, and kind of a chicken-and-egg problem. If it is done without control and used on any other backend where it is not supported, we run into a failure (as we did here). So if we want this supported on all backends, before it gets added without control, we need to verify that it works on all backends (i.e. doesn't crash). Right now this is introducing a crash on one backend due to missing support. So it might be "an order of work" thing: to make this unconditional and always used, we need the SPIR-V backend to handle it. Till that is done/scoped out, this needs some control to opt out?
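(The "outer product vs dot product" control mentioned above refers to the existing contraction-lowering strategy in the same options struct. A hedged sketch of that enum as it looked around this time, recalled rather than copied verbatim from `VectorRewritePatterns.h`:)

// Existing per-strategy control for lowering vector.contract; shows that
// opt-in/opt-out knobs of this kind already have precedent in the dialect.
enum class VectorContractLowering {
  // Progressively lower to finer-grained vector.contract and dot products.
  Dot = 0,
  // Lower to vector.matrix_multiply, mapping 1-1 to LLVM matrix intrinsics.
  Matmul = 1,
  // Lower to a series of vector.outerproduct ops.
  OuterProduct = 2,
  // Unroll all reduction dims to 1 and lower to elementwise arithmetic.
  ParallelArith = 3,
};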
Thanks, someone pointed at a downstream sharding analysis with affine maps, which I thought was also an issue. So we can just match this in the upstream lowering? Do we have a repro for the SPIR-V lowering?
I think that was a different discussion of why the
Yes. Let me try to summarize the repro in MLIR. Take any of the outputs that are generated from the tests added here https://github.com/llvm/llvm-project/pull/72105/files#diff-ec70fcfbae18679d0f257f9112001765e0306ef8dbcf152956b46d50862a4223R798 , say
If you run it through the conversion to SPIR-V, it fails. What we have is basically a vector lowering that introduces `vector.shape_cast` irrespective of the backend being used, without it being supported on all backends. That seems like a bad situation to be in.
Sure, let's fix the incomplete backend.
Absolutely! No judgement here. A lot of us contribute to/maintain a lot of these backends (especially SPIR-V). We should try to fix the incomplete backend, but we should not make things that hit this issue part of the default flow that downstream users cannot opt out of, right? I have no idea how much work it is to finish this on the SPIR-V side. It seems reasonable to me to coordinate and fix the incompleteness in the backends before adding something that is always on.
I will just reiterate:
Now, trying to figure out how to move this forward:
My expectation is that (eventually) the SPIR-V community will implement the missing lowerings for `vector.shape_cast`. If my assumption is not correct, then either the Vector dialect needs to be updated (*) or the conclusion is that SPIR-V is not compatible with Vector. To me, adding support for lowering `vector.shape_cast` to SPIR-V is the way forward.
True, but the opposite would mean limiting what the LLVM IR backend can use due to limitations of the SPIR-V backend. I know that's not what you are suggesting, but what if there was a 3rd backend with even more limitations than SPIR-V? Should that mean limiting both the LLVM IR and SPIR-V backends even further?
IMHO this is a fair request and I support it. We can "guard" this opt-out mechanism with comments pointing at this discussion and discouraging re-use of it beyond this case. However, there should also be commitment from SPIR-V to fill the gaps in the existing lowerings, or some roadmap with alternatives. That would really help the discussion.

Just to clarify - this patch wasn't meant as the solution, but merely as a workaround.

-Andrzej

(*) That's being proposed here: https://discourse.llvm.org/t/improving-handling-of-unit-dimensions-in-the-vector-dialect/
Thanks everyone, and sorry about the late reply--I am really overwhelmed by some company-internal tasks due to upcoming events, so I wasn't able to look into this as the absolute top priority. I guess I can be categorized as one of the SPIR-V folks in MLIR. :) If we want to be specific, we only have Jakub and me as the daily maintainers there recently. Compared to the number of people actively helping on plumbing the LLVM flow, I think we are certainly not that resource-abundant. :) Both of us are fully occupied with upcoming company launch events at the moment. Hopefully this explains a bit why there are delays and why we may want to temporarily resort to workarounds--I don't want people to be under the impression that "SPIR-V folks" are just blocking everybody.

FWIW, as @MaheshRavishankar said, we also contribute to a lot of other layers in MLIR, given that SPIR-V is just an exit and we prefer to handle issues at the proper layer. We all work on the same codebase. :) I totally understand the desire to have a clean codebase in MLIR, and that's what we all need to contribute to. FWIW, we have been helping to clean up quite a lot of things along the whole history of MLIR. I'd certainly make sure this is the top priority for me once I'm more freed up. I really don't want to make @banach-space and others continue waiting due to my inability to take care of this right now, so I hope this is an okay intermediate state before Dec 6th. FWIW, we already have 4 different ways to control how `vector.transpose` is lowered.

Let's move this forward please? :)
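(The "4 different ways" mentioned above plausibly refer to the existing `VectorTransposeLowering` strategies consumed by the same `VectorTransformsOptions` struct that this patch extends. A hedged sketch of that enum, recalled from around this time rather than copied verbatim:)

// Existing per-strategy control for lowering vector.transpose
// (see VectorRewritePatterns.h); useShapeCast adds a fifth, orthogonal knob.
enum class VectorTransposeLowering {
  // Lower transpose into element-wise extracts and inserts.
  EltWise = 0,
  // Lower 2-D transpose to vector.flat_transpose (LLVM matrix intrinsics).
  Flat = 1,
  // Lower 2-D transpose to vector.shuffle on a 1-D vector.
  Shuffle1D = 2,
  // Lower 2-D transpose to vector.shuffle through a 16x16 vector.
  Shuffle16x16 = 3,
};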
From my perspective this looks like the right path to take now. So looks good to me.
LGTM, just one comment:
Add comment and a setter
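(A setter in the style of the struct's existing fluent setters, such as the `vectorTransferSplit` one visible in the diff context, might look like the following. This is a sketch of the review request, not the code that actually landed, shown in a stripped-down struct for illustration:)

struct VectorTransformsOptions {
  // Enable lowering eligible vector.transpose ops to vector.shape_cast.
  // Disable for targets (e.g. SPIR-V) that cannot handle vector.shape_cast.
  bool useShapeCast = true;

  // Hypothetical fluent setter mirroring the existing setXxx members.
  VectorTransformsOptions &setUseShapeCast(bool opt) {
    useShapeCast = opt;
    return *this;
  }
};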
I have updated the summary and added some comments to explain that this is a workaround specifically for SPIR-V, and that we would be removing it once we can lower `vector.shape_cast` to SPIR-V.

I would like to merge this today, so please comment if I misread something or you still feel that we shouldn't be landing this. Btw, I really appreciate this discussion - it is helping us to build a better compiler for all 🙏🏻!
Also LGTM for the time being, thanks
LGTM, merging today sounds good!
Has anybody tried this already? ^
Yeah, thanks for pointing it out! Agreed, looks nice to check out. As said above, we are overwhelmed by company-internal tasks. This would be the first task I'd do after Dec 6, if nobody beats me to it.
Revert "[mlir][vector] Make TransposeOpLowering configurable (#73915)" (#75062)

Reverting a workaround intended specifically for SPIR-V. That workaround emerged from this discussion:

* #72105

AFAIK, it hasn't been required in practice. This is based on IREE (https://github.com/openxla/iree), which has just bumped its fork of LLVM without using it (*).

(*) iree-org/iree@cef31e7

This reverts commit bbd2b08.
Following the discussion here:

* #72105

this patch makes the `TransposeOpLowering` configurable so that one can select whether to favour `vector.shape_cast` over `vector.transpose`.

As per the discussion in #72105, using `vector.shape_cast` is very beneficial and desirable when targeting LLVM IR (CPU lowering), but won't work when targeting SPIR-V today (GPU lowering). Hence the need for a mechanism to be able to disable/enable the pattern introduced in #72105. This patch proposes one such mechanism.

While this should solve the problem that we are facing today, it's understood to be a temporary workaround. It should be removed once support for lowering `vector.shape_cast` to SPIR-V is added. Also, (once implemented) the following proposal might make this workaround redundant:

* https://discourse.llvm.org/t/improving-handling-of-unit-dimensions-in-the-vector-dialect/