[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave #132123

mshockwave · 2025-03-20T00:20:08Z

Given this shuffle:

shufflevector <8 x i8> %0, <8 x i8> %1, <8 x i32> <i32 0, i32 4, i32 8, i32 12, i32 undef, i32 undef, i32 undef, i32 undef>

#127272 lowers it with a bunch of vnsrl. If we describe the result in terms of the shuffle mask, we expect:

<0, 4, 8, 12, u, u, u, u>

but we actually got:

<0, 4, u, u, 8, 12, u, u>

for factor larger than 2. This is caused by CONCAT_VECTORS on incorrect (sub) vector types. This patch fixes the said issue by building an aggregate vector with the correct sub vector types.

Fix #132071

Fix llvm#132071

llvmbot · 2025-03-20T00:20:41Z

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

Given this shuffle:

shufflevector &lt;8 x i8&gt; %0, &lt;8 x i8&gt; %1, &lt;8 x i32&gt; &lt;i32 0, i32 4, i32 8, i32 12, i32 undef, i32 undef, i32 undef, i32 undef&gt;

#127272 lowers it with a bunch of vnsrl. If we describe the result in terms of the shuffle mask, we expect:

&lt;0, 4, 8, 12, u, u, u, u&gt;

but we actually got:

&lt;0, 4, u, u, 8, 12, u, u&gt;

for factor larger than 2. This is caused by CONCAT_VECTORS on incorrect (sub) vector types. This patch fixes the said issue by using the correct sub vector types and recursively concat them with UNDEF when needed.

Fix #132071

Full diff: https://github.com/llvm/llvm-project/pull/132123.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+17-4)
(modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll (+4-2)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 48d8fc23dc1bb..4a0de99428605 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -5545,12 +5545,25 @@ static SDValue lowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG,
             })) {
           // Narrow each source and concatenate them.
           // FIXME: For small LMUL it is better to concatenate first.
-          MVT HalfVT = VT.getHalfNumVectorElementsVT();
+          MVT EltVT = VT.getVectorElementType();
+          auto EltCnt = VT.getVectorElementCount();
+          MVT SubVT =
+              MVT::getVectorVT(EltVT, EltCnt.divideCoefficientBy(Factor));
+
           SDValue Lo =
-              getDeinterleaveShiftAndTrunc(DL, HalfVT, V1, Factor, Index, DAG);
+              getDeinterleaveShiftAndTrunc(DL, SubVT, V1, Factor, Index, DAG);
           SDValue Hi =
-              getDeinterleaveShiftAndTrunc(DL, HalfVT, V2, Factor, Index, DAG);
-          return DAG.getNode(ISD::CONCAT_VECTORS, DL, VT, Lo, Hi);
+              getDeinterleaveShiftAndTrunc(DL, SubVT, V2, Factor, Index, DAG);
+
+          MVT NewVT = SubVT.getDoubleNumVectorElementsVT();
+          SDValue Concat = DAG.getNode(ISD::CONCAT_VECTORS, DL, NewVT, Lo, Hi);
+          for (unsigned F = Factor; F > 2; F >>= 1) {
+            SDValue Undef = DAG.getUNDEF(NewVT);
+            NewVT = NewVT.getDoubleNumVectorElementsVT();
+            Concat = DAG.getNode(ISD::CONCAT_VECTORS, DL, NewVT, Concat, Undef);
+          }
+
+          return Concat;
         }
       }
     }
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll
index 5e6d7c1eedb76..0b4231cedcab5 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll
@@ -378,8 +378,9 @@ define void @deinterleave4_0_i8_two_source(ptr %in0, ptr %in1, ptr %out) {
 ; CHECK-NEXT:    vsetvli zero, zero, e8, mf8, ta, ma
 ; CHECK-NEXT:    vnsrl.wi v8, v8, 0
 ; CHECK-NEXT:    vnsrl.wi v9, v9, 0
+; CHECK-NEXT:    vsetivli zero, 4, e8, mf4, ta, ma
+; CHECK-NEXT:    vslideup.vi v9, v8, 2
 ; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
-; CHECK-NEXT:    vslideup.vi v9, v8, 4
 ; CHECK-NEXT:    vse8.v v9, (a2)
 ; CHECK-NEXT:    ret
 entry:
@@ -402,8 +403,9 @@ define void @deinterleave4_8_i8_two_source(ptr %in0, ptr %in1, ptr %out) {
 ; CHECK-NEXT:    vsetvli zero, zero, e8, mf8, ta, ma
 ; CHECK-NEXT:    vnsrl.wi v8, v8, 0
 ; CHECK-NEXT:    vnsrl.wi v9, v9, 0
+; CHECK-NEXT:    vsetivli zero, 4, e8, mf4, ta, ma
+; CHECK-NEXT:    vslideup.vi v9, v8, 2
 ; CHECK-NEXT:    vsetivli zero, 8, e8, mf2, ta, ma
-; CHECK-NEXT:    vslideup.vi v9, v8, 4
 ; CHECK-NEXT:    vse8.v v9, (a2)
 ; CHECK-NEXT:    ret
 entry:

topperc · 2025-03-20T00:24:18Z

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

+          for (unsigned F = Factor; F > 2; F >>= 1) {
+            SDValue Undef = DAG.getUNDEF(NewVT);
+            NewVT = NewVT.getDoubleNumVectorElementsVT();
+            Concat = DAG.getNode(ISD::CONCAT_VECTORS, DL, NewVT, Concat, Undef);


Are we just slowly building an INSERT_SUBVECTOR?

Are we just slowly building an INSERT_SUBVECTOR?

The CONCAT_VECTORS lowering code does a similar thing on CONCAT_VECTORS with more than 2 operand which turns it into a tree of CONCAT_VECTORS. But I don't think this tree of CONCAT_VECTORS will become a single INSERT_SUBVECTOR.

Can we use INSERT_SUBVECTOR here?

Can we use INSERT_SUBVECTOR here?

Using INSERT_SUBVECTOR is indeed simpler. It's done know.

topperc · 2025-03-20T04:24:16Z

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave.ll

@@ -378,7 +378,7 @@ define void @deinterleave4_0_i8_two_source(ptr %in0, ptr %in1, ptr %out) {
 ; CHECK-NEXT:    vsetvli zero, zero, e8, mf8, ta, ma
 ; CHECK-NEXT:    vnsrl.wi v8, v8, 0
 ; CHECK-NEXT:    vnsrl.wi v9, v9, 0
-; CHECK-NEXT:    vsetivli zero, 4, e8, mf4, ta, ma
+; CHECK-NEXT:    vsetivli zero, 4, e8, mf2, tu, ma


Would it be better to CONCAT_VECTORS the Lo and Hi using a 2x type, then widen the rest of the way with INSERT_SUBVECTOR?

Yes it's better (no tu). It's fixed now.

topperc

LGTM

preames

LGTM

[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave

d298e8c

Fix llvm#132071

mshockwave requested a review from topperc March 20, 2025 00:20

llvmbot added the backend:RISC-V label Mar 20, 2025

topperc requested a review from preames March 20, 2025 00:21

mshockwave mentioned this pull request Mar 20, 2025

[RISC-V] Miscompile on rv64gcv with -O[23] #132071

Closed

topperc reviewed Mar 20, 2025

View reviewed changes

fixup! Use INSERT_SUBVECTOR intead of using many CONCAT_VECTORS

1400fa9

topperc reviewed Mar 20, 2025

View reviewed changes

fixup! Address review comments

d1146cf

topperc approved these changes Mar 20, 2025

View reviewed changes

preames approved these changes Mar 20, 2025

View reviewed changes

mshockwave merged commit 03ceb26 into llvm:main Mar 20, 2025
11 checks passed

mshockwave deleted the patch/fix-132071 branch March 20, 2025 16:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave #132123

[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave #132123

Uh oh!

mshockwave commented Mar 20, 2025 •

edited

Loading

Uh oh!

llvmbot commented Mar 20, 2025

Uh oh!

topperc Mar 20, 2025

Uh oh!

mshockwave Mar 20, 2025 •

edited

Loading

Uh oh!

topperc Mar 20, 2025

Uh oh!

mshockwave Mar 20, 2025

Uh oh!

topperc Mar 20, 2025

Uh oh!

mshockwave Mar 20, 2025

Uh oh!

topperc left a comment

Uh oh!

preames left a comment

Uh oh!

Uh oh!

Uh oh!

[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave #132123

[RISCV] Fix incorrect slide offset when using vnsrl to de-interleave #132123

Uh oh!

Conversation

mshockwave commented Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 20, 2025

Uh oh!

topperc Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

mshockwave Mar 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

topperc Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

mshockwave Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

topperc Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

mshockwave Mar 20, 2025

Choose a reason for hiding this comment

Uh oh!

topperc left a comment

Choose a reason for hiding this comment

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

mshockwave commented Mar 20, 2025 •

edited

Loading

mshockwave Mar 20, 2025 •

edited

Loading