[RISCV] Lower (vector_interleave X, undef) to (vzext_vl X). #87283

topperc · 2024-04-01T21:27:55Z

If the odd vector is undef or poison, the widening add and multiply trick
doesn't work unless we freeze the odd vector.

Unfortunately, freezing doesn't work when the operand is provably
undef/poison. MIR doesn't have a representation for freeze so it
just becomes a COPY from IMPLICIT_DEF which freely propagates undef
to each operand independently.

To work around this, check for undef explicitly and lower to a VZEXT_VL
of the even vector. This produces better code than we'd get from a
freeze anyway.

I've left a FIXME for adding a freeze. I'll do that as a separate patch
as it affects other tests and doesn't help with the new test.

…is literal poison. The interleave lowering relies on a math trick that requires passing the odd vector to two math instructions. In order to be correct these instructions must see the same value. If the odd vector is provably poison or undef, SelectionDAG will create a vwadd and vwmaccu where the operand is a copy from IMPLICIT_DEF. Later this will become just the undef flag on the operand. This gives the register allocator freedom to pick a different register for each instruction.
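To make the failure mode concrete, here is a minimal scalar sketch of the widening add and multiply-accumulate trick for one result element, assuming SEW = 8. The function names are hypothetical and do not appear in the LLVM sources; the model only mirrors the arithmetic of vwaddu.vv followed by vwmaccu.vx with the scalar 2^SEW - 1.

```cpp
#include <cstdint>

// Scalar model of one 2*SEW-wide result element, with SEW = 8.
// All names are illustrative; none of them exist in LLVM.

// Models vwaddu.vv followed by vwmaccu.vx with the scalar 2^SEW - 1:
//   wide = zext(even) + zext(oddForAdd) + (2^SEW - 1) * zext(oddForMacc)
// The two "odd" parameters stand for the registers that each instruction
// actually reads. They are meant to be the same value.
constexpr uint16_t wideningInterleave(uint8_t even, uint8_t oddForAdd,
                                      uint8_t oddForMacc) {
  return static_cast<uint16_t>(uint32_t(even) + uint32_t(oddForAdd) +
                               0xFFu * uint32_t(oddForMacc));
}

// The lowering used when the odd vector is undef: a plain zero extend.
constexpr uint16_t zextInterleave(uint8_t even) { return even; }

// Same odd value in both instructions: even ends up in the low half and
// odd in the high half, which is the interleaved layout.
static_assert(wideningInterleave(0x12, 0x34, 0x34) == 0x3412,
              "matching odd operands interleave correctly");

// Different odd values (as can happen when each undef operand gets its own
// register): the low half is 0xF0, no longer the even element 0x12.
static_assert(wideningInterleave(0x12, 0x34, 0x56) == 0x55F0,
              "mismatched odd operands corrupt the even lane");

// The zero extend always preserves the even element.
static_assert(zextInterleave(0x12) == 0x0012, "zext keeps the even lane");
```

The second assertion shows why the two instructions must see the same value: the low half only equals the even element when the `oddForAdd` term and the `-oddForMacc` term cancel.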

llvmbot · 2024-04-01T21:28:28Z

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

If the odd vector is undef or poison, the widening add and multiply trick
doesn't work unless we freeze the odd vector.

Unfortunately, freezing doesn't work when the operand is provably
undef/poison. MIR doesn't have a representation for freeze so it
just becomes a COPY from IMPLICIT_DEF which freely propagates undef
to each operand independently.

To work around this, check for undef explicitly and lower to a VZEXT_VL
of the even vector. This produces better code than we'd get from a
freeze anyway.

I've left a FIXME for adding a freeze. I'll do that as a separate patch
as it affects other tests and doesn't help with the new test.

Full diff: https://github.com/llvm/llvm-project/pull/87283.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+10-1)
(modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+21)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index f693cbd3bea51e..c37938cd9559f0 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -4624,7 +4624,13 @@ static SDValue getWideningInterleave(SDValue EvenV, SDValue OddV,
   SDValue Passthru = DAG.getUNDEF(WideContainerVT);
 
   SDValue Interleaved;
-  if (Subtarget.hasStdExtZvbb()) {
+  if (OddV.isUndef()) {
+    // If OddV is undef, this is a zero extend.
+    // FIXME: Not only does this optimize the code, it fixes some correctness
+    // issues because MIR does not have freeze.
+    Interleaved = DAG.getNode(RISCVISD::VZEXT_VL, DL, WideContainerVT, EvenV,
+                              Mask, VL);
+  } else if (Subtarget.hasStdExtZvbb()) {
     // Interleaved = (OddV << VecVT.getScalarSizeInBits()) + EvenV.
     SDValue OffsetVec =
         DAG.getSplatVector(VecContainerVT, DL,
@@ -4635,6 +4641,9 @@ static SDValue getWideningInterleave(SDValue EvenV, SDValue OddV,
     Interleaved = DAG.getNode(RISCVISD::VWADDU_W_VL, DL, WideContainerVT,
                               Interleaved, EvenV, Passthru, Mask, VL);
   } else {
+    // FIXME: We should freeze the odd vector here. We already handled the case
+    // of provably undef/poison above.
+
     // Widen EvenV and OddV with 0s and add one copy of OddV to EvenV with
     // vwaddu.vv
     Interleaved = DAG.getNode(RISCVISD::VWADDU_VL, DL, WideContainerVT, EvenV,
diff --git a/llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll b/llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
index 1acc0fec8fe586..7068e044dfc4f4 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll
@@ -656,6 +656,27 @@ define <vscale x 16 x double> @vector_interleave_nxv16f64_nxv8f64(<vscale x 8 x
   ret <vscale x 16 x double> %res
 }
 
+; FIXME: The last operand to the vwaddu.vv and vwmaccu.vx are both undef. They
+; need to be the same register with the same contents. Otherwise, the even
+; elements will not contain just the values from %a.
+define <vscale x 8 x i32> @vector_interleave_nxv8i32_nxv4i32_poison(<vscale x 4 x i32> %a) {
+; CHECK-LABEL: vector_interleave_nxv8i32_nxv4i32_poison:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli a0, zero, e64, m4, ta, ma
+; CHECK-NEXT:    vzext.vf2 v12, v8
+; CHECK-NEXT:    vmv.v.v v8, v12
+; CHECK-NEXT:    ret
+;
+; ZVBB-LABEL: vector_interleave_nxv8i32_nxv4i32_poison:
+; ZVBB:       # %bb.0:
+; ZVBB-NEXT:    vsetvli a0, zero, e64, m4, ta, ma
+; ZVBB-NEXT:    vzext.vf2 v12, v8
+; ZVBB-NEXT:    vmv.v.v v8, v12
+; ZVBB-NEXT:    ret
+  %res = call <vscale x 8 x i32> @llvm.experimental.vector.interleave2.nxv8i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> poison)
+  ret <vscale x 8 x i32> %res
+}
+
 declare <vscale x 64 x half> @llvm.experimental.vector.interleave2.nxv64f16(<vscale x 32 x half>, <vscale x 32 x half>)
 declare <vscale x 32 x float> @llvm.experimental.vector.interleave2.nxv32f32(<vscale x 16 x float>, <vscale x 16 x float>)
 declare <vscale x 16 x double> @llvm.experimental.vector.interleave2.nxv16f64(<vscale x 8 x double>, <vscale x 8 x double>)

github-actions · 2024-04-01T21:30:49Z

✅ With the latest revision this PR passed the C/C++ code formatter.

lukel97

LGTM

kito-cheng · 2024-04-02T12:48:41Z

And maybe (vector_interleave undef, X) to (vsll (vzext_vl X) sew_of_x) for a separate patch?

topperc · 2024-04-02T18:59:50Z

Committed as 8c1dc5d and a9af66a

preames · 2024-04-10T21:40:23Z

> And maybe (vector_interleave undef, X) to (vsll (vzext_vl X) sew_of_x) for a separate patch?

I happened to be looking at this code today, and had a similar realization. If anyone has interest, I think this generalizes significantly.

This tactic can be generalized for an entire family of shuffles of the form:

<a0, zero, a1, zero> -- this case
<zero, a0, zero, a1> -- the vwsll case (also the one kito points out)
<a0, zero, zero, zero, a1, zero, zero, zero> -- the zext.vf4 case

And the VWADD is just a specialization of the ISD::SELECT case when alternating lanes are zero in the two inputs.

Interestingly, the recursive reasoning would nearly make the special case two argument form redundant. The only bit left is the decision to "commit" undef to be zero.
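The shuffle family above can be sketched in scalar form, viewing each group of narrow lanes as one wide element. This is an illustration only, assuming SEW = 8 and the little-endian lane order RVV uses within a wide element; the helper names are hypothetical and are not LLVM APIs.

```cpp
#include <cstdint>

// Each helper builds one wide element from a single SEW = 8 source element.
// All names are illustrative; none of them exist in LLVM.

// <a0, zero, a1, zero>: the source sits in the low lane, so the wide
// element is a zero extend (the vzext.vf2 case from this PR).
constexpr uint16_t sourceInEvenLane(uint8_t a) { return a; }

// <zero, a0, zero, a1>: the source sits in the high lane, so the wide
// element is a zero extend shifted left by SEW (the vwsll case).
constexpr uint16_t sourceInOddLane(uint8_t a) {
  return static_cast<uint16_t>(uint32_t(a) << 8);
}

// <a0, zero, zero, zero, a1, ...>: the source sits in lane 0 of four, so
// the wide element is a zero extend to 4*SEW (the zext.vf4 case).
constexpr uint32_t sourceInFirstOfFour(uint8_t a) { return a; }

// Extracts narrow lane i from a wide element, lane 0 being the low bits.
constexpr uint8_t lane(uint32_t wide, unsigned i) {
  return static_cast<uint8_t>(wide >> (8 * i));
}

static_assert(lane(sourceInEvenLane(0xAB), 0) == 0xAB, "even lane holds a");
static_assert(lane(sourceInEvenLane(0xAB), 1) == 0x00, "odd lane is zero");
static_assert(lane(sourceInOddLane(0xAB), 0) == 0x00, "even lane is zero");
static_assert(lane(sourceInOddLane(0xAB), 1) == 0xAB, "odd lane holds a");
static_assert(lane(sourceInFirstOfFour(0xAB), 0) == 0xAB, "lane 0 holds a");
static_assert(lane(sourceInFirstOfFour(0xAB), 3) == 0x00, "lane 3 is zero");
```

In each case the zero lanes are where the lowering "commits" an undef input to zero, which is the one decision the comment above notes is left over.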

topperc added 2 commits April 1, 2024 14:07

topperc requested review from preames and lukel97 April 1, 2024 21:27

llvmbot added the backend:RISC-V label Apr 1, 2024

fixup! clang-format and remove FIXME from test.

68b6802

lukel97 approved these changes Apr 2, 2024

Merge remote-tracking branch 'origin/main' into pr/interleave-poison

0557582

topperc closed this Apr 2, 2024

topperc deleted the pr/interleave-poison branch April 2, 2024 18:59
