Skip to content

[mlir][Vector] Add vector.extract(vector.shuffle) folder #115105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 7, 2024

Conversation

dcaballe
Copy link
Contributor

@dcaballe dcaballe commented Nov 6, 2024

This PR adds a folder for extracting an element from a vector shuffle. It turns something like:

   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector<8xf32>, vector<8xf32>
   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>

into:

   %extract = vector.extract %b[7] : f32 from vector<8xf32>

This PR adds a folder for extracting an element from a vector shuffle.
It turns something like:

```
   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector<8xf32>, vector<8xf32>
   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
```

into:

```
   %extract = vector.extract %b[7] : f32 from vector<4xf32>
```
@llvmbot
Copy link
Member

llvmbot commented Nov 6, 2024

@llvm/pr-subscribers-mlir

Author: Diego Caballero (dcaballe)

Changes

This PR adds a folder for extracting an element from a vector shuffle. It turns something like:

   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector&lt;8xf32&gt;, vector&lt;8xf32&gt;
   %extract = vector.extract %shuffle[3] : f32 from vector&lt;4xf32&gt;

into:

   %extract = vector.extract %b[7] : f32 from vector&lt;4xf32&gt;

Full diff: https://github.com/llvm/llvm-project/pull/115105.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+42)
  • (modified) mlir/test/Dialect/Vector/canonicalize.mlir (+18)
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index d8913251e56e9e..723044aa2b66e4 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -1705,6 +1705,46 @@ static Value foldExtractFromBroadcast(ExtractOp extractOp) {
   return extractOp.getResult();
 }
 
+/// Fold extractOp coming from ShuffleOp.
+///
+/// Example:
+///
+///   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
+///     : vector<8xf32>, vector<8xf32>
+///   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
+/// ->
+///   %extract = vector.extract %b[7] : f32 from vector<4xf32>
+///
+static Value foldExtractFromShuffle(ExtractOp extractOp) {
+  // TODO: Canonicalization for dynamic position not implemented yet.
+  if (extractOp.hasDynamicPosition())
+    return Value();
+
+  auto shuffleOp = extractOp.getVector().getDefiningOp<ShuffleOp>();
+  if (!shuffleOp)
+    return Value();
+
+  // TODO: 0-D or multi-dimensional vectors not supported yet.
+  if (shuffleOp.getResultVectorType().getRank() != 1)
+    return Value();
+
+  int64_t inputVecSize = shuffleOp.getV1().getType().getShape()[0];
+  auto shuffleMask = shuffleOp.getMask();
+  int64_t extractIdx = extractOp.getStaticPosition()[0];
+  int64_t shuffleIdx = shuffleMask[extractIdx];
+
+  // Find the shuffled vector to extract from based on the shuffle index.
+  if (shuffleIdx < inputVecSize) {
+    extractOp.setOperand(0, shuffleOp.getV1());
+    extractOp.setStaticPosition({shuffleIdx});
+  } else {
+    extractOp.setOperand(0, shuffleOp.getV2());
+    extractOp.setStaticPosition({shuffleIdx - inputVecSize});
+  }
+
+  return extractOp.getResult();
+}
+
 // Fold extractOp with source coming from ShapeCast op.
 static Value foldExtractFromShapeCast(ExtractOp extractOp) {
   // TODO: Canonicalization for dynamic position not implemented yet.
@@ -1953,6 +1993,8 @@ OpFoldResult ExtractOp::fold(FoldAdaptor) {
     return res;
   if (auto res = foldExtractFromBroadcast(*this))
     return res;
+  if (auto res = foldExtractFromShuffle(*this))
+    return res;
   if (auto res = foldExtractFromShapeCast(*this))
     return res;
   if (auto val = foldExtractFromExtractStrided(*this))
diff --git a/mlir/test/Dialect/Vector/canonicalize.mlir b/mlir/test/Dialect/Vector/canonicalize.mlir
index df87f86765a3a3..5ae769090dac66 100644
--- a/mlir/test/Dialect/Vector/canonicalize.mlir
+++ b/mlir/test/Dialect/Vector/canonicalize.mlir
@@ -740,6 +740,24 @@ func.func @fold_extract_broadcast(%a : vector<1xf32>) -> vector<8xf32> {
   %r = vector.extract %b[0] : vector<8xf32> from vector<1x8xf32>
   return %r : vector<8xf32>
 }
+// -----
+
+// CHECK-LABEL: @fold_extract_shuffle
+//  CHECK-SAME:   %[[A:.*]]: vector<8xf32>, %[[B:.*]]: vector<8xf32>
+//   CHECK-NOT:   vector.shuffle
+//       CHECK:   vector.extract %[[A]][0] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[B]][0] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[A]][7] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[B]][7] : f32 from vector<8xf32>
+func.func @fold_extract_shuffle(%a : vector<8xf32>, %b : vector<8xf32>)
+                                -> (f32, f32, f32, f32) {
+  %shuffle = vector.shuffle %a, %b [0, 8, 7, 15] : vector<8xf32>, vector<8xf32>
+  %e0 = vector.extract %shuffle[0] : f32 from vector<4xf32>
+  %e1 = vector.extract %shuffle[1] : f32 from vector<4xf32>
+  %e2 = vector.extract %shuffle[2] : f32 from vector<4xf32>
+  %e3 = vector.extract %shuffle[3] : f32 from vector<4xf32>
+  return %e0, %e1, %e2, %e3 : f32, f32, f32, f32
+}
 
 // -----
 

@llvmbot
Copy link
Member

llvmbot commented Nov 6, 2024

@llvm/pr-subscribers-mlir-vector

Author: Diego Caballero (dcaballe)

Changes

This PR adds a folder for extracting an element from a vector shuffle. It turns something like:

   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector&lt;8xf32&gt;, vector&lt;8xf32&gt;
   %extract = vector.extract %shuffle[3] : f32 from vector&lt;4xf32&gt;

into:

   %extract = vector.extract %b[7] : f32 from vector&lt;4xf32&gt;

Full diff: https://github.com/llvm/llvm-project/pull/115105.diff

2 Files Affected:

  • (modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+42)
  • (modified) mlir/test/Dialect/Vector/canonicalize.mlir (+18)
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index d8913251e56e9e..723044aa2b66e4 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -1705,6 +1705,46 @@ static Value foldExtractFromBroadcast(ExtractOp extractOp) {
   return extractOp.getResult();
 }
 
+/// Fold extractOp coming from ShuffleOp.
+///
+/// Example:
+///
+///   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
+///     : vector<8xf32>, vector<8xf32>
+///   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
+/// ->
+///   %extract = vector.extract %b[7] : f32 from vector<4xf32>
+///
+static Value foldExtractFromShuffle(ExtractOp extractOp) {
+  // TODO: Canonicalization for dynamic position not implemented yet.
+  if (extractOp.hasDynamicPosition())
+    return Value();
+
+  auto shuffleOp = extractOp.getVector().getDefiningOp<ShuffleOp>();
+  if (!shuffleOp)
+    return Value();
+
+  // TODO: 0-D or multi-dimensional vectors not supported yet.
+  if (shuffleOp.getResultVectorType().getRank() != 1)
+    return Value();
+
+  int64_t inputVecSize = shuffleOp.getV1().getType().getShape()[0];
+  auto shuffleMask = shuffleOp.getMask();
+  int64_t extractIdx = extractOp.getStaticPosition()[0];
+  int64_t shuffleIdx = shuffleMask[extractIdx];
+
+  // Find the shuffled vector to extract from based on the shuffle index.
+  if (shuffleIdx < inputVecSize) {
+    extractOp.setOperand(0, shuffleOp.getV1());
+    extractOp.setStaticPosition({shuffleIdx});
+  } else {
+    extractOp.setOperand(0, shuffleOp.getV2());
+    extractOp.setStaticPosition({shuffleIdx - inputVecSize});
+  }
+
+  return extractOp.getResult();
+}
+
 // Fold extractOp with source coming from ShapeCast op.
 static Value foldExtractFromShapeCast(ExtractOp extractOp) {
   // TODO: Canonicalization for dynamic position not implemented yet.
@@ -1953,6 +1993,8 @@ OpFoldResult ExtractOp::fold(FoldAdaptor) {
     return res;
   if (auto res = foldExtractFromBroadcast(*this))
     return res;
+  if (auto res = foldExtractFromShuffle(*this))
+    return res;
   if (auto res = foldExtractFromShapeCast(*this))
     return res;
   if (auto val = foldExtractFromExtractStrided(*this))
diff --git a/mlir/test/Dialect/Vector/canonicalize.mlir b/mlir/test/Dialect/Vector/canonicalize.mlir
index df87f86765a3a3..5ae769090dac66 100644
--- a/mlir/test/Dialect/Vector/canonicalize.mlir
+++ b/mlir/test/Dialect/Vector/canonicalize.mlir
@@ -740,6 +740,24 @@ func.func @fold_extract_broadcast(%a : vector<1xf32>) -> vector<8xf32> {
   %r = vector.extract %b[0] : vector<8xf32> from vector<1x8xf32>
   return %r : vector<8xf32>
 }
+// -----
+
+// CHECK-LABEL: @fold_extract_shuffle
+//  CHECK-SAME:   %[[A:.*]]: vector<8xf32>, %[[B:.*]]: vector<8xf32>
+//   CHECK-NOT:   vector.shuffle
+//       CHECK:   vector.extract %[[A]][0] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[B]][0] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[A]][7] : f32 from vector<8xf32>
+//       CHECK:   vector.extract %[[B]][7] : f32 from vector<8xf32>
+func.func @fold_extract_shuffle(%a : vector<8xf32>, %b : vector<8xf32>)
+                                -> (f32, f32, f32, f32) {
+  %shuffle = vector.shuffle %a, %b [0, 8, 7, 15] : vector<8xf32>, vector<8xf32>
+  %e0 = vector.extract %shuffle[0] : f32 from vector<4xf32>
+  %e1 = vector.extract %shuffle[1] : f32 from vector<4xf32>
+  %e2 = vector.extract %shuffle[2] : f32 from vector<4xf32>
+  %e3 = vector.extract %shuffle[3] : f32 from vector<4xf32>
+  return %e0, %e1, %e2, %e3 : f32, f32, f32, f32
+}
 
 // -----
 

Copy link
Contributor

@hanhanW hanhanW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I have a question because I'm seeing a TODO. (Please wait for at least a day to see if there are feedback from others.)

Comment on lines 1719 to 1721
// TODO: Canonicalization for dynamic position not implemented yet.
if (extractOp.hasDynamicPosition())
return Value();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, how do we support the dynamic case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing that comes to my mind is a series of compare and branches?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I'm thinking, and I'm wondering if there are other approaches. This approach looks bad to me because it introduces branching in the IR. If we don't have a good plan, perhaps we should drop the TODO and make an explicit comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO is inherited from similar canonicalizations. Not sure if branches but at least quite some instructions, including more vector extracts and select ops. Definitely not a good canonical form for dynamic cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks! Yes, I also think that it is not a good canonical form. It's good that we're on the same page..

Copy link
Collaborator

@c-rhodes c-rhodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one minor comment but otherwise LGTM cheers

/// : vector<8xf32>, vector<8xf32>
/// %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
/// ->
/// %extract = vector.extract %b[7] : f32 from vector<4xf32>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// %extract = vector.extract %b[7] : f32 from vector<4xf32>
/// %extract = vector.extract %b[7] : f32 from vector<8xf32>

Comment on lines 1719 to 1721
// TODO: Canonicalization for dynamic position not implemented yet.
if (extractOp.hasDynamicPosition())
return Value();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only thing that comes to my mind is a series of compare and branches?

Copy link
Contributor

@banach-space banach-space left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@dcaballe dcaballe merged commit af5c471 into llvm:main Nov 7, 2024
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants