[IR][RISCV] Add llvm.vector.(de)interleave3/5/7 #124825

Merged
merged 7 commits into from
Feb 5, 2025

mshockwave
Member

These intrinsics are similar to llvm.vector.(de)interleave2 but work with 3/5/7 vector operands or results.
For RISC-V, it's important to have them in order to support segmented load/store with factors 2 to 8: factors 2/4/8 can be synthesized from (de)interleave2; factor 6 can be synthesized from factors 2 and 3; factors 3, 5, and 7 get their own intrinsics in this patch.

This patch only adds codegen support for these intrinsics; we still need to teach the vectorizer to generate them and teach InterleavedAccessPass to use them.
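
For readers unfamiliar with the element order, here is a scalar C++ sketch of what an N-way (de)interleave computes, plus one possible way to build factor 6 out of factors 2 and 3. The helper names (`interleave`, `deinterleave`, `deinterleave6Via2And3`) are made up for illustration, and the composition order is an assumption; the patch itself does not spell out how factor 6 would be synthesized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Scalar model of interleaveN: out[N*j + k] = inputs[k][j].
Vec interleave(const std::vector<Vec> &inputs) {
  const std::size_t factor = inputs.size();
  const std::size_t len = inputs.empty() ? 0 : inputs[0].size();
  Vec out(factor * len);
  for (std::size_t k = 0; k < factor; ++k) {
    assert(inputs[k].size() == len && "operands must have equal length");
    for (std::size_t j = 0; j < len; ++j)
      out[factor * j + k] = inputs[k][j];
  }
  return out;
}

// Scalar model of deinterleaveN: results[k][j] = input[N*j + k].
std::vector<Vec> deinterleave(const Vec &input, std::size_t factor) {
  assert(factor != 0 && input.size() % factor == 0);
  const std::size_t len = input.size() / factor;
  std::vector<Vec> out(factor, Vec(len));
  for (std::size_t k = 0; k < factor; ++k)
    for (std::size_t j = 0; j < len; ++j)
      out[k][j] = input[factor * j + k];
  return out;
}

// Hypothetical factor-6 deinterleave: split into even/odd elements first
// (factor 2), then apply factor 3 to each half.
std::vector<Vec> deinterleave6Via2And3(const Vec &input) {
  const std::vector<Vec> halves = deinterleave(input, 2);
  const std::vector<Vec> evens = deinterleave(halves[0], 3); // fields 0, 2, 4
  const std::vector<Vec> odds = deinterleave(halves[1], 3);  // fields 1, 3, 5
  return {evens[0], odds[0], evens[1], odds[1], evens[2], odds[2]};
}
```

Under this model, `deinterleave6Via2And3(v)` and `deinterleave(v, 6)` agree for any `v` whose length is a multiple of 6, which is the property the "factor 6 from 2 and 3" claim relies on.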

Co-Authored-By: Craig Topper <[email protected]>
@llvmbot
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-risc-v

Author: Min-Yih Hsu (mshockwave)

Changes

Patch is 302.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124825.diff

11 Files Affected:

  • (modified) llvm/include/llvm/IR/DerivedTypes.h (+9)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+9-4)
  • (modified) llvm/include/llvm/IR/Intrinsics.td (+60)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp (+12-8)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+48-20)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+58-24)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (+2-2)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+34)
  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+183-69)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-deinterleave.ll (+4093-24)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll (+2859-25)
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index b44f4f8c8687dc..60606d34c32c31 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -536,6 +536,15 @@ class VectorType : public Type {
                            EltCnt.divideCoefficientBy(2));
   }
 
+  static VectorType *getOneNthElementsVectorType(VectorType *VTy,
+                                                 unsigned Denominator) {
+    auto EltCnt = VTy->getElementCount();
+    assert(EltCnt.isKnownMultipleOf(Denominator) &&
+           "Cannot take one-nth of a vector");
+    return VectorType::get(VTy->getScalarType(),
+                           EltCnt.divideCoefficientBy(Denominator));
+  }
+
   /// This static method returns a VectorType with twice as many elements as the
   /// input type and the same element type.
   static VectorType *getDoubleElementsVectorType(VectorType *VTy) {
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index 82f72131b9d2f4..a6f243a2d98798 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -148,6 +148,9 @@ namespace Intrinsic {
       ExtendArgument,
       TruncArgument,
       HalfVecArgument,
+      OneThirdVecArgument,
+      OneFifthVecArgument,
+      OneSeventhVecArgument,
       SameVecWidthArgument,
       VecOfAnyPtrsToElt,
       VecElementArgument,
@@ -178,15 +181,17 @@ namespace Intrinsic {
     unsigned getArgumentNumber() const {
       assert(Kind == Argument || Kind == ExtendArgument ||
              Kind == TruncArgument || Kind == HalfVecArgument ||
-             Kind == SameVecWidthArgument || Kind == VecElementArgument ||
-             Kind == Subdivide2Argument || Kind == Subdivide4Argument ||
-             Kind == VecOfBitcastsToInt);
+             Kind == OneThirdVecArgument || Kind == OneFifthVecArgument ||
+             Kind == OneSeventhVecArgument || Kind == SameVecWidthArgument ||
+             Kind == VecElementArgument || Kind == Subdivide2Argument ||
+             Kind == Subdivide4Argument || Kind == VecOfBitcastsToInt);
       return Argument_Info >> 3;
     }
     ArgKind getArgumentKind() const {
       assert(Kind == Argument || Kind == ExtendArgument ||
              Kind == TruncArgument || Kind == HalfVecArgument ||
-             Kind == SameVecWidthArgument ||
+             Kind == OneThirdVecArgument || Kind == OneFifthVecArgument ||
+             Kind == OneSeventhVecArgument || Kind == SameVecWidthArgument ||
              Kind == VecElementArgument || Kind == Subdivide2Argument ||
              Kind == Subdivide4Argument || Kind == VecOfBitcastsToInt);
       return (ArgKind)(Argument_Info & 7);
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index ee877349a33149..3597400df9b771 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -300,6 +300,8 @@ def IIT_V1 : IIT_Vec<1, 28>;
 def IIT_VARARG : IIT_VT<isVoid, 29>;
 def IIT_HALF_VEC_ARG : IIT_Base<30>;
 def IIT_SAME_VEC_WIDTH_ARG : IIT_Base<31>;
+def IIT_ONE_THIRD_VEC_ARG : IIT_Base<32>;
+def IIT_ONE_FIFTH_VEC_ARG : IIT_Base<33>;
 def IIT_VEC_OF_ANYPTRS_TO_ELT : IIT_Base<34>;
 def IIT_I128 : IIT_Int<128, 35>;
 def IIT_V512 : IIT_Vec<512, 36>;
@@ -327,6 +329,7 @@ def IIT_I4 : IIT_Int<4, 58>;
 def IIT_AARCH64_SVCOUNT : IIT_VT<aarch64svcount, 59>;
 def IIT_V6 : IIT_Vec<6, 60>;
 def IIT_V10 : IIT_Vec<10, 61>;
+def IIT_ONE_SEVENTH_VEC_ARG : IIT_Base<62>;
 }
 
 defvar IIT_all_FixedTypes = !filter(iit, IIT_all,
@@ -467,6 +470,15 @@ class LLVMVectorElementType<int num> : LLVMMatchType<num, IIT_VEC_ELEMENT>;
 class LLVMHalfElementsVectorType<int num>
   : LLVMMatchType<num, IIT_HALF_VEC_ARG>;
 
+class LLVMOneThirdElementsVectorType<int num>
+  : LLVMMatchType<num, IIT_ONE_THIRD_VEC_ARG>;
+
+class LLVMOneFifthElementsVectorType<int num>
+  : LLVMMatchType<num, IIT_ONE_FIFTH_VEC_ARG>;
+
+class LLVMOneSeventhElementsVectorType<int num>
+  : LLVMMatchType<num, IIT_ONE_SEVENTH_VEC_ARG>;
+
 // Match the type of another intrinsic parameter that is expected to be a
 // vector type (i.e. <N x iM>) but with each element subdivided to
 // form a vector with more elements that are smaller than the original.
@@ -2728,6 +2740,54 @@ def int_vector_deinterleave2 : DefaultAttrsIntrinsic<[LLVMHalfElementsVectorType
                                                      [llvm_anyvector_ty],
                                                      [IntrNoMem]>;
 
+def int_vector_interleave3   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
+                                                     [LLVMOneThirdElementsVectorType<0>,
+                                                      LLVMOneThirdElementsVectorType<0>,
+                                                      LLVMOneThirdElementsVectorType<0>],
+                                                     [IntrNoMem]>;
+
+def int_vector_deinterleave3 : DefaultAttrsIntrinsic<[LLVMOneThirdElementsVectorType<0>,
+                                                      LLVMOneThirdElementsVectorType<0>,
+                                                      LLVMOneThirdElementsVectorType<0>],
+                                                     [llvm_anyvector_ty],
+                                                     [IntrNoMem]>;
+
+def int_vector_interleave5   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
+                                                     [LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>],
+                                                     [IntrNoMem]>;
+
+def int_vector_deinterleave5 : DefaultAttrsIntrinsic<[LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>,
+                                                      LLVMOneFifthElementsVectorType<0>],
+                                                     [llvm_anyvector_ty],
+                                                     [IntrNoMem]>;
+
+def int_vector_interleave7   : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
+                                                     [LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>],
+                                                     [IntrNoMem]>;
+
+def int_vector_deinterleave7 : DefaultAttrsIntrinsic<[LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>,
+                                                      LLVMOneSeventhElementsVectorType<0>],
+                                                     [llvm_anyvector_ty],
+                                                     [IntrNoMem]>;
+
 //===-------------- Intrinsics to perform partial reduction ---------------===//
 
 def int_experimental_vector_partial_reduce_add : DefaultAttrsIntrinsic<[LLVMMatchType<0>],
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
index b0a624680231e9..c95f7b7eb8dec3 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
@@ -5825,15 +5825,19 @@ SDValue DAGTypeLegalizer::PromoteIntRes_VECTOR_SPLICE(SDNode *N) {
 }
 
 SDValue DAGTypeLegalizer::PromoteIntRes_VECTOR_INTERLEAVE_DEINTERLEAVE(SDNode *N) {
-  SDLoc dl(N);
+  SDLoc DL(N);
+  unsigned Factor = N->getNumOperands();
+
+  SmallVector<SDValue, 8> Ops(Factor);
+  for (unsigned i = 0; i != Factor; i++)
+    Ops[i] = GetPromotedInteger(N->getOperand(i));
+
+  SmallVector<EVT, 8> ResVTs(Factor, Ops[0].getValueType());
+  SDValue Res = DAG.getNode(N->getOpcode(), DL, DAG.getVTList(ResVTs), Ops);
+
+  for (unsigned i = 0; i != Factor; i++)
+    SetPromotedInteger(SDValue(N, i), Res.getValue(i));
 
-  SDValue V0 = GetPromotedInteger(N->getOperand(0));
-  SDValue V1 = GetPromotedInteger(N->getOperand(1));
-  EVT ResVT = V0.getValueType();
-  SDValue Res = DAG.getNode(N->getOpcode(), dl,
-                            DAG.getVTList(ResVT, ResVT), V0, V1);
-  SetPromotedInteger(SDValue(N, 0), Res.getValue(0));
-  SetPromotedInteger(SDValue(N, 1), Res.getValue(1));
   return SDValue();
 }
 
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index f39d9ca15496a9..03d0298e99ad4d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -1668,6 +1668,15 @@ void DAGTypeLegalizer::SplitVecRes_INSERT_SUBVECTOR(SDNode *N, SDValue &Lo,
     return;
   }
 
+  if (getTypeAction(SubVecVT) == TargetLowering::TypeWidenVector &&
+      Vec.isUndef() && SubVecVT.getVectorElementType() == MVT::i1) {
+    SDValue WideSubVec = GetWidenedVector(SubVec);
+    if (WideSubVec.getValueType() == VecVT) {
+      std::tie(Lo, Hi) = DAG.SplitVector(WideSubVec, SDLoc(WideSubVec));
+      return;
+    }
+  }
+
   // Spill the vector to the stack.
   // In cases where the vector is illegal it will be broken down into parts
   // and stored in parts - we should use the alignment for the smallest part.
@@ -3183,34 +3192,53 @@ void DAGTypeLegalizer::SplitVecRes_VP_REVERSE(SDNode *N, SDValue &Lo,
 }
 
 void DAGTypeLegalizer::SplitVecRes_VECTOR_DEINTERLEAVE(SDNode *N) {
+  unsigned Factor = N->getNumOperands();
+
+  SmallVector<SDValue, 8> Ops(Factor * 2);
+  for (unsigned i = 0; i != Factor; ++i) {
+    SDValue OpLo, OpHi;
+    GetSplitVector(N->getOperand(i), OpLo, OpHi);
+    Ops[i * 2] = OpLo;
+    Ops[i * 2 + 1] = OpHi;
+  }
+
+  SmallVector<EVT, 8> VTs(Factor, Ops[0].getValueType());
 
-  SDValue Op0Lo, Op0Hi, Op1Lo, Op1Hi;
-  GetSplitVector(N->getOperand(0), Op0Lo, Op0Hi);
-  GetSplitVector(N->getOperand(1), Op1Lo, Op1Hi);
-  EVT VT = Op0Lo.getValueType();
   SDLoc DL(N);
-  SDValue ResLo = DAG.getNode(ISD::VECTOR_DEINTERLEAVE, DL,
-                              DAG.getVTList(VT, VT), Op0Lo, Op0Hi);
-  SDValue ResHi = DAG.getNode(ISD::VECTOR_DEINTERLEAVE, DL,
-                              DAG.getVTList(VT, VT), Op1Lo, Op1Hi);
+  SDValue ResLo = DAG.getNode(ISD::VECTOR_DEINTERLEAVE, DL, VTs,
+                              ArrayRef(Ops).slice(0, Factor));
+  SDValue ResHi = DAG.getNode(ISD::VECTOR_DEINTERLEAVE, DL, VTs,
+                              ArrayRef(Ops).slice(Factor, Factor));
 
-  SetSplitVector(SDValue(N, 0), ResLo.getValue(0), ResHi.getValue(0));
-  SetSplitVector(SDValue(N, 1), ResLo.getValue(1), ResHi.getValue(1));
+  for (unsigned i = 0; i != Factor; ++i)
+    SetSplitVector(SDValue(N, i), ResLo.getValue(i), ResHi.getValue(i));
 }
 
 void DAGTypeLegalizer::SplitVecRes_VECTOR_INTERLEAVE(SDNode *N) {
-  SDValue Op0Lo, Op0Hi, Op1Lo, Op1Hi;
-  GetSplitVector(N->getOperand(0), Op0Lo, Op0Hi);
-  GetSplitVector(N->getOperand(1), Op1Lo, Op1Hi);
-  EVT VT = Op0Lo.getValueType();
+  unsigned Factor = N->getNumOperands();
+
+  SmallVector<SDValue, 8> Ops(Factor * 2);
+  for (unsigned i = 0; i != Factor; ++i) {
+    SDValue OpLo, OpHi;
+    GetSplitVector(N->getOperand(i), OpLo, OpHi);
+    Ops[i] = OpLo;
+    Ops[i + Factor] = OpHi;
+  }
+
+  SmallVector<EVT, 8> VTs(Factor, Ops[0].getValueType());
+
   SDLoc DL(N);
-  SDValue Res[] = {DAG.getNode(ISD::VECTOR_INTERLEAVE, DL,
-                               DAG.getVTList(VT, VT), Op0Lo, Op1Lo),
-                   DAG.getNode(ISD::VECTOR_INTERLEAVE, DL,
-                               DAG.getVTList(VT, VT), Op0Hi, Op1Hi)};
+  SDValue Res[] = {DAG.getNode(ISD::VECTOR_INTERLEAVE, DL, VTs,
+                               ArrayRef(Ops).slice(0, Factor)),
+                   DAG.getNode(ISD::VECTOR_INTERLEAVE, DL, VTs,
+                               ArrayRef(Ops).slice(Factor, Factor))};
 
-  SetSplitVector(SDValue(N, 0), Res[0].getValue(0), Res[0].getValue(1));
-  SetSplitVector(SDValue(N, 1), Res[1].getValue(0), Res[1].getValue(1));
+  for (unsigned i = 0; i != Factor; ++i) {
+    unsigned IdxLo = 2 * i;
+    unsigned IdxHi = 2 * i + 1;
+    SetSplitVector(SDValue(N, i), Res[IdxLo / Factor].getValue(IdxLo % Factor),
+                   Res[IdxHi / Factor].getValue(IdxHi % Factor));
+  }
 }
 
 //===----------------------------------------------------------------------===//
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 428e7a316d247b..6867944b5d8b4a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -8251,10 +8251,28 @@ void SelectionDAGBuilder::visitIntrinsicCall(const CallInst &I,
     visitCallBrLandingPad(I);
     return;
   case Intrinsic::vector_interleave2:
-    visitVectorInterleave(I);
+    visitVectorInterleave(I, 2);
+    return;
+  case Intrinsic::vector_interleave3:
+    visitVectorInterleave(I, 3);
+    return;
+  case Intrinsic::vector_interleave5:
+    visitVectorInterleave(I, 5);
+    return;
+  case Intrinsic::vector_interleave7:
+    visitVectorInterleave(I, 7);
     return;
   case Intrinsic::vector_deinterleave2:
-    visitVectorDeinterleave(I);
+    visitVectorDeinterleave(I, 2);
+    return;
+  case Intrinsic::vector_deinterleave3:
+    visitVectorDeinterleave(I, 3);
+    return;
+  case Intrinsic::vector_deinterleave5:
+    visitVectorDeinterleave(I, 5);
+    return;
+  case Intrinsic::vector_deinterleave7:
+    visitVectorDeinterleave(I, 7);
     return;
   case Intrinsic::experimental_vector_compress:
     setValue(&I, DAG.getNode(ISD::VECTOR_COMPRESS, sdl,
@@ -12565,26 +12583,31 @@ void SelectionDAGBuilder::visitVectorReverse(const CallInst &I) {
   setValue(&I, DAG.getVectorShuffle(VT, DL, V, DAG.getUNDEF(VT), Mask));
 }
 
-void SelectionDAGBuilder::visitVectorDeinterleave(const CallInst &I) {
+void SelectionDAGBuilder::visitVectorDeinterleave(const CallInst &I,
+                                                  unsigned Factor) {
   auto DL = getCurSDLoc();
   SDValue InVec = getValue(I.getOperand(0));
-  EVT OutVT =
-      InVec.getValueType().getHalfNumVectorElementsVT(*DAG.getContext());
 
+  SmallVector<EVT, 4> ValueVTs;
+  ComputeValueVTs(DAG.getTargetLoweringInfo(), DAG.getDataLayout(), I.getType(),
+                  ValueVTs);
+
+  EVT OutVT = ValueVTs[0];
   unsigned OutNumElts = OutVT.getVectorMinNumElements();
 
-  // ISD Node needs the input vectors split into two equal parts
-  SDValue Lo = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, OutVT, InVec,
-                           DAG.getVectorIdxConstant(0, DL));
-  SDValue Hi = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, OutVT, InVec,
-                           DAG.getVectorIdxConstant(OutNumElts, DL));
+  SmallVector<SDValue, 4> SubVecs(Factor);
+  for (unsigned i = 0; i != Factor; ++i) {
+    assert(ValueVTs[i] == OutVT && "Expected VTs to be the same");
+    SubVecs[i] = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, OutVT, InVec,
+                             DAG.getVectorIdxConstant(OutNumElts * i, DL));
+  }
 
   // Use VECTOR_SHUFFLE for fixed-length vectors to benefit from existing
   // legalisation and combines.
-  if (OutVT.isFixedLengthVector()) {
-    SDValue Even = DAG.getVectorShuffle(OutVT, DL, Lo, Hi,
+  if (OutVT.isFixedLengthVector() && Factor == 2) {
+    SDValue Even = DAG.getVectorShuffle(OutVT, DL, SubVecs[0], SubVecs[1],
                                         createStrideMask(0, 2, OutNumElts));
-    SDValue Odd = DAG.getVectorShuffle(OutVT, DL, Lo, Hi,
+    SDValue Odd = DAG.getVectorShuffle(OutVT, DL, SubVecs[0], SubVecs[1],
                                        createStrideMask(1, 2, OutNumElts));
     SDValue Res = DAG.getMergeValues({Even, Odd}, getCurSDLoc());
     setValue(&I, Res);
@@ -12592,32 +12615,43 @@ void SelectionDAGBuilder::visitVectorDeinterleave(const CallInst &I) {
   }
 
   SDValue Res = DAG.getNode(ISD::VECTOR_DEINTERLEAVE, DL,
-                            DAG.getVTList(OutVT, OutVT), Lo, Hi);
+                            DAG.getVTList(ValueVTs), SubVecs);
   setValue(&I, Res);
 }
 
-void SelectionDAGBuilder::visitVectorInterleave(const CallInst &I) {
+void SelectionDAGBuilder::visitVectorInterleave(const CallInst &I,
+                                                unsigned Factor) {
   auto DL = getCurSDLoc();
-  EVT InVT = getValue(I.getOperand(0)).getValueType();
-  SDValue InVec0 = getValue(I.getOperand(0));
-  SDValue InVec1 = getValue(I.getOperand(1));
   const TargetLowering &TLI = DAG.getTargetLoweringInfo();
+  EVT InVT = getValue(I.getOperand(0)).getValueType();
   EVT OutVT = TLI.getValueType(DAG.getDataLayout(), I.getType());
 
+  SmallVector<SDValue, 8> InVecs(Factor);
+  for (unsigned i = 0; i < Factor; ++i) {
+    InVecs[i] = getValue(I.getOperand(i));
+    assert(InVecs[i].getValueType() == InVecs[0].getValueType() &&
+           "Expected VTs to be the same");
+  }
+
   // Use VECTOR_SHUFFLE for fixed-length vectors to benefit from existing
   // legalisation and combines.
-  if (OutVT.isFixedLengthVector()) {
+  if (OutVT.isFixedLengthVector() && Factor == 2) {
     unsigned NumElts = InVT.getVectorMinNumElements();
-    SDValue V = DAG.getNode(ISD::CONCAT_VECTORS, DL, OutVT, InVec0, InVec1);
+    SDValue V = DAG.getNode(ISD::CONCAT_VECTORS, DL, OutVT, InVecs);
     setValue(&I, DAG.getVectorShuffle(OutVT, DL, V, DAG.getUNDEF(OutVT),
                                       createInterleaveMask(NumElts, 2)));
     return;
   }
 
-  SDValue Res = DAG.getNode(ISD::VECTOR_INTERLEAVE, DL,
-                            DAG.getVTList(InVT, InVT), InVec0, InVec1);
-  Res = DAG.getNode(ISD::CONCAT_VECTORS, DL, OutVT, Res.getValue(0),
-                    Res.getValue(1));
+  SmallVector<EVT, 8> ValueVTs(Factor, InVT);
+  SDValue Res =
+      DAG.getNode(ISD::VECTOR_INTERLEAVE, DL, DAG.getVTList(ValueVTs), InVecs);
+
+  SmallVector<SDValue, 8> Results(Factor);
+  for (unsigned i = 0; i < Factor; ++i)
+    Results[i] = Res.getValue(i);
+
+  Res = DAG.getNode(ISD::CONCAT_VECTORS, DL, OutVT, Results);
   setValue(&I, Res);
 }
 
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
index ed85deef64fa79..ece48c9bedf722 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h
@@ -659,8 +659,8 @@ class SelectionDAGBuilder {
   void visitVectorReduce(const CallInst &I, unsigned Intrinsic);
   void visitVectorReverse(const CallInst &I);
   void visitVectorSplice(const CallInst &I);
-  void visitVectorInterleave(const CallInst &I);
-  void visitVectorDeinterleave(const CallInst &I);
+  void visitVectorInterleave(const CallInst &I, unsigned Factor);
+  void visitVectorDeinterleave(const CallInst &I, unsigned Factor);
   void visitStepVector(const CallInst &I);
 
   void visitUserOp1(const Instruction &I) {
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index ec1184e8d835d6..107caebede1391 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -362,6 +362,24 @@ DecodeIITType(unsigned &NextElt, ArrayRef<unsigned char> In...
[truncated]
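
The index arithmetic in the new `SplitVecRes_VECTOR_INTERLEAVE` (result `i` takes pieces `2*i` and `2*i+1` out of the two half-width interleaves) is easy to get wrong, so here is a scalar C++ sketch that checks it against a direct interleave. `interleaveNode` and `splitInterleave` are illustrative names, not LLVM APIs, and the model assumes every operand has an even number of elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Model of a VECTOR_INTERLEAVE node: N equal-length inputs produce N outputs
// whose concatenation is the element-wise interleaving of the inputs.
std::vector<Vec> interleaveNode(const std::vector<Vec> &ops) {
  const std::size_t factor = ops.size();
  const std::size_t len = ops.empty() ? 0 : ops[0].size();
  Vec flat(factor * len);
  for (std::size_t k = 0; k < factor; ++k)
    for (std::size_t j = 0; j < len; ++j)
      flat[factor * j + k] = ops[k][j];
  std::vector<Vec> results;
  for (std::size_t k = 0; k < factor; ++k) {
    const auto first = flat.begin() + static_cast<std::ptrdiff_t>(k * len);
    results.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
  }
  return results;
}

// Split each operand into Lo/Hi halves, interleave the Lo halves and the Hi
// halves separately, then rebuild result i from flat pieces 2*i and 2*i+1.
std::vector<Vec> splitInterleave(const std::vector<Vec> &ops) {
  const std::size_t factor = ops.size();
  std::vector<Vec> los, his;
  for (const Vec &op : ops) {
    const auto mid = op.begin() + static_cast<std::ptrdiff_t>(op.size() / 2);
    los.emplace_back(op.begin(), mid);
    his.emplace_back(mid, op.end());
  }
  const std::vector<Vec> res[2] = {interleaveNode(los), interleaveNode(his)};
  std::vector<Vec> out;
  for (std::size_t i = 0; i < factor; ++i) {
    const std::size_t idxLo = 2 * i, idxHi = 2 * i + 1;
    Vec whole = res[idxLo / factor][idxLo % factor];        // Lo half of result i
    const Vec &hi = res[idxHi / factor][idxHi % factor];    // Hi half of result i
    whole.insert(whole.end(), hi.begin(), hi.end());
    out.push_back(whole);
  }
  return out;
}
```

With factor 3, result 1 straddles the two half-width nodes: its Lo half is the last output of the Lo interleave and its Hi half is the first output of the Hi interleave, which is why the code divides and takes the remainder by `Factor` instead of indexing `Res[0]` and `Res[1]` directly.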

auto [Mask, VL] = getDefaultScalableVLOps(ConcatVT, DL, DAG, Subtarget);
SDValue Passthru = DAG.getUNDEF(ConcatVT);

// For the indices, use the same SEW to avoid an extra vsetvli

Collaborator

Do we need to worry about there being too many elements for SEW=8 to represent the indices? I wrote this code, but I can't figure out how that's not an issue.

Member Author

We can probably use vrgatherei16.vv here just to be safe, because the longest (legal) type the concatenated vector can have is SEW=8 + LMUL=8, whose VLMAX fits in a 16-bit integer. Also, lowerVECTOR_INTERLEAVE is already using vrgatherei16.vv.

Member Author

@mshockwave mshockwave Jan 28, 2025

> We probably can use vrgatherei16.vv here just to be safe, because the longest (legal) type the concatenated vector can have is SEW=8 + LMUL=8, whose VLMAX can be safely put in 16-bit integer.

Well... a 16-bit element can represent all 65536 indices, which are only needed when the data operand is SEW=8 + LMUL=8, but in that case the EMUL of the index operand would be invalid (because EMUL = (16/SEW) * LMUL). I guess we also need to spill to the stack and load the values back with a segmented store and load.
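
To make the arithmetic in this thread concrete, here is a small C++ sketch of the two formulas involved: VLMAX = LMUL * VLEN / SEW and EMUL = (EEW / SEW) * LMUL. The function names are invented for illustration, only integer LMUL is modelled, and the worst case assumes the architectural maximum VLEN of 65536 bits.

```cpp
#include <cassert>
#include <cstdint>

// VLMAX = LMUL * VLEN / SEW (integer LMUL only in this sketch).
constexpr std::uint64_t vlmax(std::uint64_t vlen, std::uint64_t sew,
                              std::uint64_t lmul) {
  return lmul * vlen / sew;
}

// EMUL of an index operand whose element width is eew: (EEW / SEW) * LMUL.
constexpr std::uint64_t indexEmul(std::uint64_t eew, std::uint64_t sew,
                                  std::uint64_t lmul) {
  return eew * lmul / sew;
}

// True if every element index 0..vlmaxElts-1 fits in indexBits bits.
constexpr bool indicesFit(std::uint64_t vlmaxElts, unsigned indexBits) {
  return vlmaxElts <= (std::uint64_t{1} << indexBits);
}

// Largest register group size the vector extension allows.
constexpr std::uint64_t kMaxLmul = 8;

// Worst case from the discussion: SEW=8, LMUL=8, VLEN=65536.
static_assert(vlmax(65536, 8, 8) == 65536, "65536 elements to index");
static_assert(!indicesFit(vlmax(65536, 8, 8), 8), "8-bit indices overflow");
static_assert(indicesFit(vlmax(65536, 8, 8), 16), "16-bit indices suffice");
static_assert(indexEmul(16, 8, 8) > kMaxLmul, "but EMUL=16 is not encodable");
```

Note that 8-bit indices already stop fitting well before the worst case: with SEW=8 and LMUL=8, any VLEN above 256 gives a VLMAX above 256.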

Collaborator

But that would require a LMUL=16 vid.v to create the SEW=16 indices

Member Author

I've changed the logic here to use a unit-stride store + segmented load instead.
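
As a rough scalar picture of why a unit-stride store followed by a segmented load sidesteps the index-width problem: no index vector is ever built, because the segment load itself routes field k of each segment into result k. The sketch below models memory as a byte buffer; `unitStrideStore`, `segmentedLoad`, and `deinterleaveViaMemory` are hypothetical names, and this is not the actual lowering code in RISCVISelLowering.cpp (that part of the diff is truncated above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Unit-stride store: write the register group's elements to consecutive bytes.
void unitStrideStore(const Bytes &reg, Bytes &mem, std::size_t base) {
  assert(base + reg.size() <= mem.size());
  for (std::size_t i = 0; i < reg.size(); ++i)
    mem[base + i] = reg[i];
}

// Segmented load with nf fields per segment: field k of segment j lands in
// element j of result k.
std::vector<Bytes> segmentedLoad(const Bytes &mem, std::size_t base,
                                 std::size_t nf, std::size_t vl) {
  assert(base + nf * vl <= mem.size());
  std::vector<Bytes> fields(nf, Bytes(vl));
  for (std::size_t j = 0; j < vl; ++j)
    for (std::size_t k = 0; k < nf; ++k)
      fields[k][j] = mem[base + j * nf + k];
  return fields;
}

// Deinterleave by round-tripping the concatenated vector through a stack slot.
std::vector<Bytes> deinterleaveViaMemory(const Bytes &concat,
                                         std::size_t factor) {
  assert(factor != 0 && concat.size() % factor == 0);
  Bytes stackSlot(concat.size());
  unitStrideStore(concat, stackSlot, 0);
  return segmentedLoad(stackSlot, 0, factor, concat.size() / factor);
}
```

The trade-off is a round trip through memory in exchange for not needing a gather with wide indices.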

@mshockwave mshockwave mentioned this pull request Jan 28, 2025
16 tasks
assert(ValueVTs[i] == OutVT && "Expected VTs to be the same");
SubVecs[i] = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, OutVT, InVec,
DAG.getVectorIdxConstant(OutNumElts * i, DL));
}

// Use VECTOR_SHUFFLE for fixed-length vectors to benefit from existing

Collaborator

Comment needs updating

Member Author

Fixed.

assert(InVecs[i].getValueType() == InVecs[0].getValueType() &&
"Expected VTs to be the same");
}

// Use VECTOR_SHUFFLE for fixed-length vectors to benefit from existing

Collaborator

Comment needs updating

Member Author

Fixed.

@@ -300,6 +300,8 @@ def IIT_V1 : IIT_Vec<1, 28>;
def IIT_VARARG : IIT_VT<isVoid, 29>;
def IIT_HALF_VEC_ARG : IIT_Base<30>;
def IIT_SAME_VEC_WIDTH_ARG : IIT_Base<31>;
def IIT_ONE_THIRD_VEC_ARG : IIT_Base<32>;

Contributor

Why don't we make them consecutive?
(I don't know why we have a bubble between 31 and 34...)

Member Author

> (I don't know why we have a bubble between 31 and 34...)

There were two IIT entries related to legacy pointer types that were deprecated after we adopted opaque pointers.

> Why don't we make them consecutive?

I was going to make them more compact, but now that you've pointed it out, I think it's not really necessary. It is fixed now.

Collaborator

@topperc topperc left a comment

LGTM

@mshockwave mshockwave merged commit 5a1e16f into llvm:main Feb 5, 2025
8 checks passed
@mshockwave mshockwave deleted the patch/vp-interleaved-357 branch February 5, 2025 23:30
@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-devrel-x86-64 running on ml-opt-devrel-x86-64-b1 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/12767

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-devrel-x86-64-b1/build/bin/llc < /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/ml-opt-devrel-x86-64-b1/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-rel-x86-64 running on ml-opt-rel-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/185/builds/12729

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-rel-x86-64-b1/build/bin/llc < /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/ml-opt-rel-x86-64-b1/build/bin/FileCheck /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/ml-opt-rel-x86-64-b1/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /b/ml-opt-rel-x86-64-b1/build/bin/FileCheck /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/ml-opt-rel-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-dev-x86-64 running on ml-opt-dev-x86-64-b2 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/137/builds/12912

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-dev-x86-64-b1/build/bin/llc < /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/ml-opt-dev-x86-64-b1/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /b/ml-opt-dev-x86-64-b1/build/bin/FileCheck /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/ml-opt-dev-x86-64-b1/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

mshockwave added a commit that referenced this pull request Feb 6, 2025
@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-aarch64-linux-bootstrap-hwasan running on sanitizer-buildbot11 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/6639

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86317 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40..
FAIL: LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll (41556 of 86317)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llc < /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
Step 11 (stage2/hwasan check) failure: stage2/hwasan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86317 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40..
FAIL: LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll (41556 of 86317)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llc < /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
Step 14 (stage3/hwasan check) failure: stage3/hwasan check (failure)
...
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/ld.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/lld-link
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83438 tests, 72 workers --
Testing:  0.. 10.. 20.. 30.. 40..
FAIL: LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll (41550 of 83438)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/llc < /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm_build2_hwasan/bin/FileCheck /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /home/b/sanitizer-aarch64-linux-bootstrap-hwasan/build/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-ubuntu-fast running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/10918

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/llc < /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/18021

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llc < /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building llvm at step 7 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/18926

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-x86_64-debian-dylib/build/bin/llc < /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/1/llvm-x86_64-debian-dylib/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@llvm-ci
Collaborator

llvm-ci commented Feb 6, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/13338

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll -mtriple=riscv32 -mattr=+v,+zvfh | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll --check-prefixes=CHECK,RV32
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc -mtriple=riscv32 -mattr=+v,+zvfh
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll:250:14: error: RV32-NEXT: expected string not found in input
; RV32-NEXT: vslidedown.vi v10, v8, 10
             ^
<stdin>:286:34: note: scanning from here
 vsetivli zero, 2, e8, m1, ta, ma
                                 ^
<stdin>:287:2: note: possible intended match here
 vslidedown.vi v11, v8, 10
 ^

Input file: <stdin>
Check file: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/RISCV/rvv/vector-deinterleave-fixed.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
          281:  sub sp, sp, a0 
          282:  .cfi_escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x30, 0x22, 0x11, 0x04, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22 # sp + 48 + 4 * vlenb 
          283:  addi a0, sp, 32 
          284:  vs1r.v v8, (a0) # Unknown-size Folded Spill 
          285:  csrr s1, vlenb 
          286:  vsetivli zero, 2, e8, m1, ta, ma 
next:250'0                                      X error: no match found
          287:  vslidedown.vi v11, v8, 10 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
next:250'1      ?                          possible intended match
          288:  vslidedown.vi v10, v8, 8 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
          289:  vslidedown.vi v9, v8, 2 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~
          290:  srli s0, s1, 3 
next:250'0     ~~~~~~~~~~~~~~~~
          291:  add a0, s0, s0 
next:250'0     ~~~~~~~~~~~~~~~~
          292:  vsetvli zero, a0, e8, mf2, tu, ma 
next:250'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            .
            .
...

@mikaelholmen
Collaborator

Many bot failures above. I can add what I've seen manually:
When I build with EXPENSIVE_CHECKS, test/CodeGen/RISCV/rvv/vector-interleave.ll fails like this:

# After Post-RA pseudo instruction expansion pass
# Machine code for function vector_interleave_nxv80i1_nxv16i1: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten, TracksDebugUserValues
Frame Objects:
  fi#0: id=2 size=40, align=8, at location [SP-40]
  fi#1: id=2 size=40, align=8, at location [SP-80]
  fi#2: size=4, align=4, at location [SP-16]
Function Live Ins: $v0, $v8, $v9, $v10, $v11

bb.0 (%ir-block.0):
  liveins: $v0, $v8, $v9, $v10, $v11
  $x2 = frame-setup ADDI $x2, -16
  frame-setup CFI_INSTRUCTION def_cfa_offset 16
  $x10 = frame-setup PseudoReadVLENB
  $x11 = frame-setup ADDI $x0, 10
  $x10 = frame-setup MUL killed $x10, killed $x11
  $x2 = frame-setup SUB $x2, killed $x10
  frame-setup CFI_INSTRUCTION escape 0x0f, 0x0d, 0x72, 0x00, 0x11, 0x10, 0x22, 0x11, 0x0a, 0x92, 0xa2, 0x38, 0x00, 0x1e, 0x22
  dead renamable $x10 = PseudoVSETVLIX0 killed $x0, 193, implicit-def $vl, implicit-def $vtype
  renamable $v12m2 = PseudoVMV_V_I_M2 undef renamable $v12m2(tied-def 0), 0, -1, 3, 0, implicit $vl, implicit $vtype
  renamable $x14 = ADDI $x2, 16
  $x10 = PseudoReadVLENB
  $x11 = SLLI $x10, 2
  $x10 = ADD killed $x11, killed $x10
  $x10 = ADD $x2, killed $x10
  renamable $x10 = ADDI killed $x10, 16
  renamable $x11 = PseudoReadVLENB
  renamable $v14m2 = PseudoVMERGE_VIM_M2 undef renamable $v14m2(tied-def 0), renamable $v12m2, 1, $v0, -1, 3, implicit $vl, implicit $vtype
  $v0 = VMV1R_V killed $v8, implicit $vtype, implicit $vtype
  renamable $v18m2 = PseudoVMERGE_VIM_M2 undef renamable $v18m2(tied-def 0), renamable $v12m2, 1, $v0, -1, 3, implicit $vl, implicit $vtype
  renamable $x12 = ADD renamable $x14, renamable $x11
  renamable $x13 = SRLI renamable $x11, 2
  $v20m2 = VMV2R_V $v14m2, implicit $vtype
  $v22m2 = VMV2R_V $v16m2, implicit $vtype
  $v24 = VMV1R_V $v18, implicit $vtype, implicit $vtype
  $v0 = VMV1R_V killed $v9, implicit $vtype, implicit $vtype
  renamable $v16m2 = PseudoVMERGE_VIM_M2 undef renamable $v16m2(tied-def 0), renamable $v12m2, 1, $v0, -1, 3, implicit $vl, implicit $vtype
  $v21 = VMV1R_V $v18, implicit $vtype, implicit $vtype
  $v0 = VMV1R_V killed $v10, implicit $vtype, implicit $vtype
  renamable $v8m2 = PseudoVMERGE_VIM_M2 undef renamable $v8m2(tied-def 0), renamable $v12m2, 1, $v0, -1, 3, implicit $vl, implicit $vtype
  $v22 = VMV1R_V $v16, implicit $vtype, implicit $vtype
  $v16 = VMV1R_V killed $v19, implicit $vtype, implicit $vtype
  renamable $x15 = ADD renamable $x12, renamable $x11
  $v23 = VMV1R_V $v8, implicit $vtype, implicit $vtype
  $v18 = VMV1R_V killed $v9, implicit $vtype, implicit $vtype
  $v0 = VMV1R_V killed $v11, implicit $vtype, implicit $vtype
  renamable $v24m2 = PseudoVMERGE_VIM_M2 undef renamable $v24m2(tied-def 0), killed renamable $v12m2, 1, $v0, -1, 3, implicit $vl, implicit $vtype
  dead renamable $x16 = PseudoVSETVLIX0 killed $x0, 192, implicit-def $vl, implicit-def $vtype
  PseudoVSSEG5E8_V_M1 renamable $v20_v21_v22_v23_v24, renamable $x14, -1, 3, implicit $vl, implicit $vtype :: (store unknown-size into %stack.1, align 8)
  $v19 = VMV1R_V killed $v25, implicit $vtype, implicit $vtype
  PseudoVSSEG5E8_V_M1 killed renamable $v15_v16_v17_v18_v19, renamable $x10, -1, 3, implicit $vl, implicit $vtype :: (store unknown-size into %stack.0, align 8)
  renamable $v8 = VL1RE8_V renamable $x15 :: (load (<vscale x 1 x s64>) from %stack.1)
  renamable $x15 = ADD killed renamable $x15, renamable $x11
  renamable $v10 = VL1RE8_V killed renamable $x14 :: (load (<vscale x 1 x s64>) from %stack.1)
  renamable $x14 = ADD renamable $x15, renamable $x11
  renamable $v12 = VL1RE8_V killed renamable $x14 :: (load (<vscale x 1 x s64>) from %stack.1)
  renamable $x14 = ADD renamable $x10, renamable $x11
  renamable $v14 = VL1RE8_V renamable $x14 :: (load (<vscale x 1 x s64>) from %stack.0)
  renamable $x14 = ADD killed renamable $x14, renamable $x11
  renamable $v9 = VL1RE8_V killed renamable $x15 :: (load (<vscale x 1 x s64>) from %stack.1)
  renamable $x15 = ADD renamable $x14, renamable $x11
  renamable $v16 = VL1RE8_V renamable $x15 :: (load (<vscale x 1 x s64>) from %stack.0)
  renamable $x15 = ADD killed renamable $x15, renamable $x11
  renamable $x11 = SRLI killed renamable $x11, 1
  renamable $v11 = VL1RE8_V killed renamable $x12 :: (load (<vscale x 1 x s64>) from %stack.1)
  renamable $x12 = ADD renamable $x13, renamable $x13
  renamable $v15 = VL1RE8_V killed renamable $x14 :: (load (<vscale x 1 x s64>) from %stack.0)
  renamable $x14 = ADD renamable $x11, renamable $x11
  renamable $v13 = VL1RE8_V killed renamable $x10 :: (load (<vscale x 1 x s64>) from %stack.0)
  renamable $v17 = VL1RE8_V killed renamable $x15 :: (load (<vscale x 1 x s64>) from %stack.0)
  dead renamable $x10 = PseudoVSETVLIX0 killed $x0, 193, implicit-def $vl, implicit-def $vtype
  early-clobber renamable $v18 = PseudoVMSNE_VI_M2 killed renamable $v8m2, 0, -1, 3, implicit $vl, implicit $vtype
  early-clobber renamable $v0 = PseudoVMSNE_VI_M2 killed renamable $v10m2, 0, -1, 3, implicit $vl, implicit $vtype
  early-clobber renamable $v8 = PseudoVMSNE_VI_M2 killed renamable $v14m2, 0, -1, 3, implicit $vl, implicit $vtype
  early-clobber renamable $v9 = PseudoVMSNE_VI_M2 killed renamable $v12m2, 0, -1, 3, implicit $vl, implicit $vtype
  dead $x0 = PseudoVSETVLI killed renamable $x12, 199, implicit-def $vl, implicit-def $vtype
  early-clobber renamable $v0 = PseudoVSLIDEUP_VX_MF2 killed renamable $v0(tied-def 0), killed renamable $v18, renamable $x13, $noreg, 3, 1, implicit $vl, implicit $vtype
  early-clobber renamable $v9 = PseudoVSLIDEUP_VX_MF2 killed renamable $v9(tied-def 0), killed renamable $v8, killed renamable $x13, $noreg, 3, 1, implicit $vl, implicit $vtype
  dead $x0 = PseudoVSETVLI killed renamable $x14, 192, implicit-def $vl, implicit-def $vtype
  early-clobber renamable $v0 = PseudoVSLIDEUP_VX_M1 killed renamable $v0(tied-def 0), killed renamable $v9, killed renamable $x11, $noreg, 3, 1, implicit $vl, implicit $vtype
  dead renamable $x10 = PseudoVSETVLIX0 killed $x0, 193, implicit-def $vl, implicit-def $vtype
  early-clobber renamable $v8 = PseudoVMSNE_VI_M2 killed renamable $v16m2, 0, -1, 3, implicit $vl, implicit $vtype
  $x10 = frame-destroy PseudoReadVLENB
  $x11 = frame-destroy ADDI $x0, 10
  $x10 = frame-destroy MUL killed $x10, killed $x11
  $x2 = frame-destroy ADD $x2, killed $x10
  frame-destroy CFI_INSTRUCTION def_cfa $x2, 16
  $x2 = frame-destroy ADDI $x2, 16
  frame-destroy CFI_INSTRUCTION def_cfa_offset 0
  PseudoRET implicit $v0, implicit $v8

# End machine code for function vector_interleave_nxv80i1_nxv16i1.

*** Bad machine code: Using an undefined physical register ***
- function:    vector_interleave_nxv80i1_nxv16i1
- basic block: %bb.0  (0x56042f143608)
- instruction: $v22m2 = VMV2R_V $v16m2, implicit $vtype
- operand 1:   $v16m2
LLVM ERROR: Found 1 machine code errors.

@mshockwave
Member Author

Many bot failures above. I can just add what I've seen manually: If I build with EXPENSIVE_CHECKS I see that test/CodeGen/RISCV/rvv/vector-interleave.ll fails like [...]

Most of the buildbot failures should be fixed by e335ca7

I'm looking into the expensive check failure

@mshockwave
Member Author

I'm looking into the expensive check failure

Fix: #126155

mshockwave added a commit that referenced this pull request Feb 9, 2025
Sometimes when we're breaking up a large vector copy into several smaller
ones, not every smaller source register is initialized at the
time when the original COPY happens, and the verifier will not be
pleased when seeing the smaller copies reading from an undef register.
This patch works around the issue by attaching an implicit
read of the source operand to the newly generated copies.
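
To make the workaround concrete, here is a toy Python model of the situation in the dump above. It is a sketch only, not LLVM code: the `COVERS` table, the `Copy` class, and the acceptance rule in `undefined_reads` are all assumptions made for illustration. The assumed rule is that a read is accepted if any part of the register is defined, or if the same instruction also carries an implicit read of a larger register that contains it. The register names and the initially defined set (`v14m2` and `v18m2` written, `v16m2` not yet) are taken from the dump.

```python
from dataclasses import dataclass

# Toy table: register groups and the single vector registers they cover.
COVERS = {
    "v14m2": ("v14", "v15"),
    "v16m2": ("v16", "v17"),
    "v18m2": ("v18", "v19"),
    "v20m2": ("v20", "v21"),
    "v22m2": ("v22", "v23"),
    "v14_v15_v16_v17_v18": ("v14", "v15", "v16", "v17", "v18"),
}


def units(reg):
    """Single registers covered by reg (a plain register covers itself)."""
    return set(COVERS.get(reg, (reg,)))


@dataclass
class Copy:
    dst: str
    src: str
    implicit_uses: tuple = ()


def undefined_reads(instr, live):
    """Registers read by instr that the assumed rule would reject."""
    bad = []
    for reg in (instr.src, *instr.implicit_uses):
        if units(reg) & live:
            continue  # at least one part of the register is defined
        if any(units(reg) < units(sup) for sup in instr.implicit_uses):
            continue  # covered by an implicit read of a larger register
        bad.append(reg)
    return bad


def run(copies, live):
    """Check each copy in order, marking its destination as defined."""
    live = set(live)
    errors = []
    for c in copies:
        errors += [(c.dst, r) for r in undefined_reads(c, live)]
        live |= units(c.dst)
    return errors


# Only v14m2 and v18m2 have been written when the tuple copy is split.
LIVE = {"v14", "v15", "v18", "v19"}
TUPLE = "v14_v15_v16_v17_v18"
PIECES = [("v20m2", "v14m2"), ("v22m2", "v16m2"), ("v24", "v18")]

plain = [Copy(d, s) for d, s in PIECES]
patched = [Copy(d, s, (TUPLE,)) for d, s in PIECES]

print(run(plain, LIVE))    # the v16m2 read is rejected
print(run(patched, LIVE))  # no rejections
```

In this model the implicit read is attached to every piece for simplicity; whether the actual patch does so for every piece is not stated here.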

This is tested by llvm/test/CodeGen/RISCV/rvv/vector-interleave.ll which
would have crashed the compiler without this fix when
LLVM_ENABLE_EXPENSIVE_CHECKS is enabled. Original context:
#124825 (comment)

---------

Co-authored-by: Craig Topper <[email protected]>
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 9, 2025
… (#126155)

Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
lukel97 added a commit that referenced this pull request Jun 12, 2025
)

Currently the loop vectorizer can only vectorize interleave groups for
power-of-2 factors at scalable VFs by recursively interleaving
[de]interleave2 intrinsics.

However after #124825 and
#139893, we now have [de]interleave intrinsics for all factors up to 8,
which is enough to support all types of segmented loads and stores on
RISC-V.

Now that the interleaved access pass has been taught to lower these in
#139373 and #141512, this patch teaches the loop vectorizer to emit
these intrinsics for factors up to 8, which enables scalable
vectorization for non-power-of-2 factors.

As far as I'm aware, no in-tree target will vectorize a scalable
interleave group above factor 8 because the maximum interleave factor is
capped at 4 on AArch64 and 8 on RISC-V, and the
`-max-interleave-group-factor` CLI option defaults to 8, so the
recursive [de]interleaving code has been removed for now.

Factors of 3 with scalable VFs are also turned off in AArch64 since
there's no lowering for [de]interleave3 just yet either.
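
As a rough illustration of why power-of-2 factors could previously be built from [de]interleave2 alone while factors such as 3 need their own intrinsic, the sketch below models the element ordering on plain Python lists. The helper names (`interleave`, `deinterleave`, `interleave4_via_interleave2`, `interleave6_via_2_and_3`) are hypothetical and only mimic the intrinsics' semantics as described in this PR; they are not LLVM APIs.

```python
def interleave(*vectors):
    """Element i of operand k lands at index i * len(vectors) + k."""
    n = len(vectors[0])
    assert all(len(v) == n for v in vectors)
    return [v[i] for i in range(n) for v in vectors]


def deinterleave(vec, factor):
    """Inverse of interleave: split vec into `factor` strided vectors."""
    assert len(vec) % factor == 0
    return [vec[k::factor] for k in range(factor)]


def interleave4_via_interleave2(a, b, c, d):
    """Factor 4 built by recursively applying the factor-2 operation."""
    return interleave(interleave(a, c), interleave(b, d))


def interleave6_via_2_and_3(a, b, c, d, e, f):
    """Factor 6 built from one factor-2 and two factor-3 operations."""
    return interleave(interleave(a, c, e), interleave(b, d, f))


a, b, c, d, e, f = ([f"{name}{i}" for i in range(2)] for name in "abcdef")

# Recursive factor-2 interleaving reproduces a direct factor-4 interleave.
assert interleave4_via_interleave2(a, b, c, d) == interleave(a, b, c, d)

# Factor 6 can be composed from factors 2 and 3.
assert interleave6_via_2_and_3(a, b, c, d, e, f) == interleave(a, b, c, d, e, f)

# Factor 3 is handled directly here; deinterleave round-trips it.
assert deinterleave(interleave(a, b, c), 3) == [a, b, c]
```

Composing factor-2 steps can only ever produce power-of-2 factors, which is why factors 3, 5 and 7 need dedicated operations.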
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 12, 2025
…and 7 (#141865)

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…#141865)
