[RISCV] Improve legalization of e8 m8 VL>256 shuffles #79330

preames · 2024-01-24T17:02:01Z

If we can't produce a large enough index vector in i8, we may need to legalize
the shuffle (via scalarization - which in turn gets lowered into stack usage).
This patch makes two related changes:

* Defer legalization until we actually need to generate the vrgather
  instruction. With the new recursive structure, this only happens when
  doing the fallback for one of the arms.
* Check the actual mask values for something outside of the representable
  range.

Both are covered by recently added tests.
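
To illustrate the second bullet outside of LLVM, here is a minimal standalone sketch. It is not the code in RISCVISelLowering.cpp: the function names `oldCheckBailsOut`, `newCheckBailsOut` and `makeWrappedMask` are hypothetical, and a plain `std::vector<int>` stands in for the shuffle mask, with -1 assumed to mean an undefined lane. It contrasts a check on the element count with a check on the mask values themselves.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the old condition: bail out whenever a vector of
// 8-bit elements has more than 256 elements, regardless of the mask contents.
bool oldCheckBailsOut(unsigned EltBits, std::size_t NumElts) {
  return EltBits == 8 && NumElts > 256;
}

// Hypothetical stand-in for the new condition: bail out only if some mask
// entry cannot be represented as an unsigned 8-bit index. Undefined lanes
// (encoded as -1 here) never trip the check.
bool newCheckBailsOut(unsigned EltBits, const std::vector<int> &Mask) {
  return EltBits == 8 &&
         std::any_of(Mask.begin(), Mask.end(),
                     [](int Idx) { return Idx > 255; });
}

// Build a mask with NumElts lanes where lane i reads element i % (MaxIdx + 1),
// so the largest index that appears is MaxIdx (when NumElts > MaxIdx).
std::vector<int> makeWrappedMask(std::size_t NumElts, int MaxIdx) {
  std::vector<int> Mask(NumElts);
  for (std::size_t I = 0; I < NumElts; ++I)
    Mask[I] = static_cast<int>(I % static_cast<std::size_t>(MaxIdx + 1));
  return Mask;
}
```

With a 512-lane mask that only references elements 0 through 255, the element-count check gives up while the value-based check does not; as soon as index 256 appears, both agree.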

llvmbot · 2024-01-24T17:02:36Z

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)

Changes

If we can't produce a large enough index vector in i8, we may need to legalize the shuffle (via splitting or scalarization). We were doing this before sub-dividing, but the actual vselect doesn't have this legality property. If our two arms can be generated without resorting to the vrgather path, we were failing for no reason.

This is a functional change, but I didn't include a test (mostly because the existing logic didn't seem to have a test either.)
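
To make the "two arms" concrete, here is a simplified standalone sketch of how a two-source shuffle mask can be decomposed into one single-source mask per operand plus a select mask. It assumes the usual convention that indices below the element count refer to the first operand, indices at or above it refer to the second, and -1 marks an undefined lane. The struct and function names are hypothetical, and this is not the LLVM implementation.

```cpp
#include <vector>

struct SplitMask {
  std::vector<int> LHS;        // per-lane index into the first source, or -1
  std::vector<int> RHS;        // per-lane index into the second source, or -1
  std::vector<bool> SelectLHS; // true where the lane is taken from the LHS arm
};

// Decompose a two-source mask into two single-source masks and a select mask.
SplitMask splitTwoSourceMask(const std::vector<int> &Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  SplitMask S;
  for (int Idx : Mask) {
    const bool FromLHS = Idx >= 0 && Idx < NumElts;
    const bool FromRHS = Idx >= NumElts;
    S.LHS.push_back(FromLHS ? Idx : -1);
    S.RHS.push_back(FromRHS ? Idx - NumElts : -1);
    // Undefined lanes may come from either side; this sketch picks the LHS.
    S.SelectLHS.push_back(!FromRHS);
  }
  return S;
}
```

Each arm is then a single-source shuffle in its own right. The index-width restriction only matters if lowering an arm ends up on the vrgather path, which is why checking it before the split can reject shuffles that would otherwise lower fine.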

Full diff: https://github.com/llvm/llvm-project/pull/79330.diff

1 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+8-8)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index a38352e8e87f21..46d75529a2a151 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4965,14 +4965,6 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
   if (SDValue V = lowerVECTOR_SHUFFLEAsRotate(SVN, DAG, Subtarget))
     return V;
 
-  if (VT.getScalarSizeInBits() == 8 && VT.getVectorNumElements() > 256) {
-    // On such a large vector we're unable to use i8 as the index type.
-    // FIXME: We could promote the index to i16 and use vrgatherei16, but that
-    // may involve vector splitting if we're already at LMUL=8, or our
-    // user-supplied maximum fixed-length LMUL.
-    return SDValue();
-  }
-
   // As a backup, shuffles can be lowered via a vrgather instruction, possibly
   // merged with a second vrgather.
   SmallVector<int> ShuffleMaskLHS, ShuffleMaskRHS;
@@ -5002,6 +4994,14 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
   // single source permutation.  Note that all the splat variants
   // are handled above.
   if (V2.isUndef()) {
+    if (VT.getScalarSizeInBits() == 8 && VT.getVectorNumElements() > 256) {
+      // On such a large vector we're unable to use i8 as the index type.
+      // FIXME: We could promote the index to i16 and use vrgatherei16, but that
+      // may involve vector splitting if we're already at LMUL=8, or our
+      // user-supplied maximum fixed-length LMUL.
+      return SDValue();
+    }
+
     unsigned GatherVVOpc = RISCVISD::VRGATHER_VV_VL;
     MVT IndexVT = VT.changeTypeToInteger();
     // Since we can't introduce illegal index types at this stage, use i16 and

topperc · 2024-01-24T19:56:37Z

The description here is misleading. I believe returning SDValue() here causes the shuffle to get expanded through memory. It will go through the Expand path in LegalizeDAG.

topperc · 2024-01-24T21:26:35Z

> The description here is misleading. I believe returning SDValue() here causes the shuffle to get expanded through memory. It will go through the Expand path in LegalizeDAG.

It doesn't go through memory. I guess I was thinking of build_vector. I'm not sure what shuffle expansion does. I guess it uses extracts and build_vector?

Triggered by discussion on #79330. In the process of writing this, realized one of my recent refactorings appears to have broken the legalization for the single source case here. Fix to follow in separate patch.

If we're lowering an e8 m8 shuffle and we have an index value greater than 255, we have no available space to generate an e16 index vector. The code had originally handled this correctly, but in a recent refactoring I had moved the single source code above the check, and thus broke the single source case by accident. I have a change on review to rework this (#79330), but for now, go with the most obvious fix.
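
The "no available space" constraint comes down to arithmetic on register group sizes. The sketch below uses hypothetical helper names. It also relies on my reading of the RISC-V vector specification, not on anything stated in this thread: the index operand of vrgatherei16 has an effective LMUL of (16 / SEW) * LMUL, and that value may not exceed 8. LMUL is expressed in eighths so that fractional values stay integral, and the lower bound on EMUL is not modelled.

```cpp
// LMUL is stored multiplied by 8, so LMUL=1/8 is 1 and LMUL=8 is 64.
constexpr unsigned MaxLMULx8 = 64;

// Effective LMUL (times 8) of a 16-bit index vector paired with data of the
// given SEW and LMUL. Assumed rule: EMUL = (16 / SEW) * LMUL.
constexpr unsigned indexEMULx8(unsigned SEW, unsigned LMULx8) {
  return 16 * LMULx8 / SEW;
}

// True if an e16 index vector fits in a legal register group.
constexpr bool canUseEI16Index(unsigned SEW, unsigned LMULx8) {
  return indexEMULx8(SEW, LMULx8) <= MaxLMULx8;
}
```

For e8 at m8 the index vector would need an LMUL of 16, which does not exist, whereas e8 at m4 or e16 at m8 both fit within m8.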

github-actions · 2024-01-25T01:37:52Z

✅ With the latest revision this PR passed the C/C++ code formatter.

preames · 2024-01-25T01:39:22Z

I went ahead and added test coverage for both the original code and the new case. In the process, I discovered that one of my earlier refactorings actually broke this logic - I think my mental state was assuming the prior version of this patch despite it not having landed - and I landed a separate change to get us back to a working baseline.

Pushed a reworked change, please take a look at the description from scratch.

lukel97

LGTM

topperc · 2024-01-25T23:16:30Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

-    if (VT.getScalarSizeInBits() == 8 && VT.getVectorNumElements() > 256) {
-      // On such a large vector we're unable to use i8 as the index type.
+    if (VT.getScalarSizeInBits() == 8 &&
+        any_of(enumerate(Mask), [&](const auto &Idx) {


Why do we need enumerate?

preames requested review from lukel97 and topperc January 24, 2024 17:02

llvmbot added the backend:RISC-V label Jan 24, 2024

preames force-pushed the pr-riscv-shuffle-legalize-index-after-split branch from fdf385c to dde80b7 Compare January 25, 2024 01:35

preames changed the title ~~[RISCV] Legalize shuffle index after splitting two argument shuffles~~ [RISCV] Improve legalization of e8 m8 VL>256 shuffles Jan 25, 2024

lukel97 approved these changes Jan 25, 2024

topperc reviewed Jan 25, 2024

Address style comment in review

ce6beaa

preames merged commit ff53d50 into llvm:main Jan 31, 2024

preames deleted the pr-riscv-shuffle-legalize-index-after-split branch January 31, 2024 22:41
