[RISCV] Directly use pack* in build_vector lowering #98084

Merged
merged 1 commit into from
Jul 8, 2024

preames commented Jul 8, 2024

In 03d4332, we extended build_vector lowering to pack elements into the largest size which doesn't exceed either ELEN or XLEN. The zbkb extension - ratified under scalar crypto, but otherwise not really connected to crypto per se - adds the packh, packw, and pack instructions. These instructions are designed for exactly this pairwise packing.

I ended up choosing to directly lower to machine nodes. A combination of the slightly non-uniform semantics of these instructions (packw sign extends the result, whereas packh zero extends it), and our generic dag canonicalization (which sinks shl through or nodes), make pattern matching these tricky and not particularly robust. Another alternative was to have an ISD node for them, but that didn't seem to add much in practice.
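
For readers less familiar with zbkb, here is a rough software model of the three instructions on RV64, plus the packh/packw/pack chain that turns eight bytes into one 64-bit lane. This is only a sketch for illustration: the `model*` function names are made up and are not LLVM APIs, and XLEN = 64 is assumed throughout. It shows the non-uniform extension behaviour mentioned above, and why the sign extension done by packw does not matter once the result feeds the next pack.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model of the Zbkb pack instructions on RV64 (XLEN = 64).
// These helper names are hypothetical; they do not exist in LLVM.

// packh: low byte of rs1 goes to bits [7:0], low byte of rs2 to bits [15:8].
// All higher bits of the result are zero.
inline uint64_t modelPackH(uint64_t rs1, uint64_t rs2) {
  return ((rs2 & 0xFFu) << 8) | (rs1 & 0xFFu);
}

// packw: low halfwords are combined into a 32-bit value, which is then
// sign extended to 64 bits.
inline uint64_t modelPackW(uint64_t rs1, uint64_t rs2) {
  uint64_t lo32 = ((rs2 & 0xFFFFu) << 16) | (rs1 & 0xFFFFu);
  return (lo32 & 0x80000000u) ? (lo32 | 0xFFFFFFFF00000000ull) : lo32;
}

// pack: low 32 bits of rs1 go to bits [31:0], low 32 bits of rs2 to [63:32].
inline uint64_t modelPack(uint64_t rs1, uint64_t rs2) {
  return (rs2 << 32) | (rs1 & 0xFFFFFFFFull);
}

// Pairwise packing of eight bytes into one 64-bit lane, b[0] in the low byte.
// Each level only reads the low half of its inputs, so whatever packw left
// in the upper 32 bits is ignored by the final pack.
inline uint64_t modelPackEightBytes(const uint8_t (&b)[8]) {
  uint64_t h0 = modelPackH(b[0], b[1]);
  uint64_t h1 = modelPackH(b[2], b[3]);
  uint64_t h2 = modelPackH(b[4], b[5]);
  uint64_t h3 = modelPackH(b[6], b[7]);
  uint64_t w0 = modelPackW(h0, h1);
  uint64_t w1 = modelPackW(h2, h3);
  return modelPack(w0, w1);
}

// Small self-check of the extension behaviour.
inline void checkPackModel() {
  // packh zero extends: nothing above bit 15 is set.
  assert(modelPackH(0x1234, 0xABCD) == 0xCD34ull);
  // packw sign extends when bit 31 of the packed value is set.
  assert(modelPackW(0x1111, 0x8000) == 0xFFFFFFFF80001111ull);
}
```

The same shape (four packh, two packw, one pack per 64-bit lane) is what shows up in the RVA22U64-PACK test output in this patch.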

llvmbot commented Jul 8, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Philip Reames (preames)


Patch is 32.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/98084.diff

2 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelLowering.cpp (+32-13)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll (+235-317)
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index ef76705d8f662..7972b9abc456c 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -3905,6 +3905,21 @@ static SDValue lowerBuildVectorOfConstants(SDValue Op, SelectionDAG &DAG,
   return SDValue();
 }
 
+static unsigned getPACKOpcode(unsigned DestBW,
+                              const RISCVSubtarget &Subtarget) {
+  switch (DestBW) {
+  default:
+    llvm_unreachable("Unsupported pack size");
+  case 16:
+    return RISCV::PACKH;
+  case 32:
+    return Subtarget.is64Bit() ? RISCV::PACKW : RISCV::PACK;
+  case 64:
+    assert(Subtarget.is64Bit());
+    return RISCV::PACK;
+  }
+}
+
 /// Double the element size of the build vector to reduce the number
 /// of vslide1down in the build vector chain.  In the worst case, this
 /// trades three scalar operations for 1 vector operation.  Scalar
@@ -3933,30 +3948,34 @@ static SDValue lowerBuildVectorViaPacking(SDValue Op, SelectionDAG &DAG,
   // Produce [B,A] packed into a type twice as wide.  Note that all
   // scalars are XLenVT, possibly masked (see below).
   MVT XLenVT = Subtarget.getXLenVT();
+  SDValue Mask = DAG.getConstant(
+      APInt::getLowBitsSet(XLenVT.getSizeInBits(), ElemSizeInBits), DL, XLenVT);
   auto pack = [&](SDValue A, SDValue B) {
     // Bias the scheduling of the inserted operations to near the
     // definition of the element - this tends to reduce register
     // pressure overall.
     SDLoc ElemDL(B);
+    if (Subtarget.hasStdExtZbkb())
+      // Note that we're relying on the high bits of the result being
+      // don't care.  For PACKW, the result is *sign* extended.
+      return SDValue(
+          DAG.getMachineNode(getPACKOpcode(ElemSizeInBits * 2, Subtarget),
+                             ElemDL, XLenVT, A, B),
+          0);
+
+    A = DAG.getNode(ISD::AND, SDLoc(A), XLenVT, A, Mask);
+    B = DAG.getNode(ISD::AND, SDLoc(B), XLenVT, B, Mask);
     SDValue ShtAmt = DAG.getConstant(ElemSizeInBits, ElemDL, XLenVT);
+    SDNodeFlags Flags;
+    Flags.setDisjoint(true);
     return DAG.getNode(ISD::OR, ElemDL, XLenVT, A,
-                       DAG.getNode(ISD::SHL, ElemDL, XLenVT, B, ShtAmt));
+                       DAG.getNode(ISD::SHL, ElemDL, XLenVT, B, ShtAmt), Flags);
   };
 
-  SDValue Mask = DAG.getConstant(
-      APInt::getLowBitsSet(XLenVT.getSizeInBits(), ElemSizeInBits), DL, XLenVT);
   SmallVector<SDValue> NewOperands;
   NewOperands.reserve(NumElts / 2);
-  for (unsigned i = 0; i < VT.getVectorNumElements(); i += 2) {
-    SDValue A = Op.getOperand(i);
-    SDValue B = Op.getOperand(i + 1);
-    // Bias the scheduling of the inserted operations to near the
-    // definition of the element - this tends to reduce register
-    // pressure overall.
-    A = DAG.getNode(ISD::AND, SDLoc(A), XLenVT, A, Mask);
-    B = DAG.getNode(ISD::AND, SDLoc(B), XLenVT, B, Mask);
-    NewOperands.push_back(pack(A, B));
-  }
+  for (unsigned i = 0; i < VT.getVectorNumElements(); i += 2)
+    NewOperands.push_back(pack(Op.getOperand(i), Op.getOperand(i + 1)));
   assert(NumElts == NewOperands.size() * 2);
   MVT WideVT = MVT::getIntegerVT(ElemSizeInBits * 2);
   MVT WideVecVT = MVT::getVectorVT(WideVT, NumElts / 2);
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
index 03ed6883b537d..6ca96d3551583 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-buildvec.ll
@@ -1283,37 +1283,29 @@ define <16 x i8> @buildvec_v16i8_loads_contigous(ptr %p) {
 ; RV32VB-PACK-NEXT:    lbu a3, 2(a0)
 ; RV32VB-PACK-NEXT:    lbu a4, 3(a0)
 ; RV32VB-PACK-NEXT:    packh a1, a1, a2
-; RV32VB-PACK-NEXT:    slli a3, a3, 16
-; RV32VB-PACK-NEXT:    slli a4, a4, 24
-; RV32VB-PACK-NEXT:    or a3, a4, a3
+; RV32VB-PACK-NEXT:    packh a2, a3, a4
+; RV32VB-PACK-NEXT:    pack a1, a1, a2
 ; RV32VB-PACK-NEXT:    lbu a2, 4(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 5(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 6(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 7(a0)
-; RV32VB-PACK-NEXT:    or a1, a1, a3
-; RV32VB-PACK-NEXT:    packh a2, a2, a4
-; RV32VB-PACK-NEXT:    slli a5, a5, 16
-; RV32VB-PACK-NEXT:    slli a6, a6, 24
-; RV32VB-PACK-NEXT:    or a3, a6, a5
-; RV32VB-PACK-NEXT:    lbu a4, 8(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 9(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 10(a0)
-; RV32VB-PACK-NEXT:    lbu a7, 11(a0)
-; RV32VB-PACK-NEXT:    or a2, a2, a3
+; RV32VB-PACK-NEXT:    lbu a3, 5(a0)
+; RV32VB-PACK-NEXT:    lbu a4, 6(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 7(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 8(a0)
+; RV32VB-PACK-NEXT:    lbu a7, 9(a0)
+; RV32VB-PACK-NEXT:    packh a2, a2, a3
 ; RV32VB-PACK-NEXT:    packh a3, a4, a5
-; RV32VB-PACK-NEXT:    slli a6, a6, 16
-; RV32VB-PACK-NEXT:    slli a7, a7, 24
-; RV32VB-PACK-NEXT:    or a4, a7, a6
-; RV32VB-PACK-NEXT:    lbu a5, 12(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 13(a0)
-; RV32VB-PACK-NEXT:    lbu a7, 14(a0)
+; RV32VB-PACK-NEXT:    pack a2, a2, a3
+; RV32VB-PACK-NEXT:    packh a3, a6, a7
+; RV32VB-PACK-NEXT:    lbu a4, 10(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 11(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 12(a0)
+; RV32VB-PACK-NEXT:    lbu a7, 13(a0)
+; RV32VB-PACK-NEXT:    lbu t0, 14(a0)
 ; RV32VB-PACK-NEXT:    lbu a0, 15(a0)
-; RV32VB-PACK-NEXT:    or a3, a3, a4
-; RV32VB-PACK-NEXT:    packh a4, a5, a6
-; RV32VB-PACK-NEXT:    slli a7, a7, 16
-; RV32VB-PACK-NEXT:    slli a0, a0, 24
-; RV32VB-PACK-NEXT:    or a0, a0, a7
-; RV32VB-PACK-NEXT:    or a0, a4, a0
+; RV32VB-PACK-NEXT:    packh a4, a4, a5
+; RV32VB-PACK-NEXT:    pack a3, a3, a4
+; RV32VB-PACK-NEXT:    packh a4, a6, a7
+; RV32VB-PACK-NEXT:    packh a0, t0, a0
+; RV32VB-PACK-NEXT:    pack a0, a4, a0
 ; RV32VB-PACK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
 ; RV32VB-PACK-NEXT:    vmv.v.x v8, a1
 ; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a2
@@ -1420,45 +1412,33 @@ define <16 x i8> @buildvec_v16i8_loads_contigous(ptr %p) {
 ; RVA22U64-PACK-NEXT:    lbu a3, 2(a0)
 ; RVA22U64-PACK-NEXT:    lbu a4, 3(a0)
 ; RVA22U64-PACK-NEXT:    packh a1, a1, a2
-; RVA22U64-PACK-NEXT:    slli a3, a3, 16
-; RVA22U64-PACK-NEXT:    slli a4, a4, 24
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    lbu a2, 4(a0)
-; RVA22U64-PACK-NEXT:    or a6, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 5(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 6(a0)
-; RVA22U64-PACK-NEXT:    slli a2, a2, 32
+; RVA22U64-PACK-NEXT:    packh a2, a3, a4
+; RVA22U64-PACK-NEXT:    lbu a3, 4(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 5(a0)
+; RVA22U64-PACK-NEXT:    packw a6, a1, a2
+; RVA22U64-PACK-NEXT:    lbu a2, 6(a0)
 ; RVA22U64-PACK-NEXT:    lbu a5, 7(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a2, a2, a3
-; RVA22U64-PACK-NEXT:    slli a4, a4, 48
-; RVA22U64-PACK-NEXT:    slli a5, a5, 56
-; RVA22U64-PACK-NEXT:    or a4, a4, a5
-; RVA22U64-PACK-NEXT:    or a2, a2, a4
-; RVA22U64-PACK-NEXT:    lbu a3, 8(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 9(a0)
-; RVA22U64-PACK-NEXT:    lbu a5, 10(a0)
-; RVA22U64-PACK-NEXT:    lbu a1, 11(a0)
-; RVA22U64-PACK-NEXT:    or a2, a6, a2
 ; RVA22U64-PACK-NEXT:    packh a3, a3, a4
-; RVA22U64-PACK-NEXT:    slli a5, a5, 16
-; RVA22U64-PACK-NEXT:    slli a1, a1, 24
-; RVA22U64-PACK-NEXT:    or a1, a1, a5
-; RVA22U64-PACK-NEXT:    lbu a4, 12(a0)
-; RVA22U64-PACK-NEXT:    or a1, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 13(a0)
-; RVA22U64-PACK-NEXT:    lbu a5, 14(a0)
-; RVA22U64-PACK-NEXT:    slli a4, a4, 32
+; RVA22U64-PACK-NEXT:    lbu a4, 8(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 9(a0)
+; RVA22U64-PACK-NEXT:    packh a2, a2, a5
+; RVA22U64-PACK-NEXT:    packw a2, a3, a2
+; RVA22U64-PACK-NEXT:    pack a6, a6, a2
+; RVA22U64-PACK-NEXT:    packh a7, a4, a1
+; RVA22U64-PACK-NEXT:    lbu a3, 10(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 11(a0)
+; RVA22U64-PACK-NEXT:    lbu a5, 12(a0)
+; RVA22U64-PACK-NEXT:    lbu a2, 13(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 14(a0)
 ; RVA22U64-PACK-NEXT:    lbu a0, 15(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    slli a5, a5, 48
-; RVA22U64-PACK-NEXT:    slli a0, a0, 56
-; RVA22U64-PACK-NEXT:    or a0, a0, a5
-; RVA22U64-PACK-NEXT:    or a0, a0, a3
-; RVA22U64-PACK-NEXT:    or a0, a0, a1
+; RVA22U64-PACK-NEXT:    packh a3, a3, a4
+; RVA22U64-PACK-NEXT:    packw a3, a7, a3
+; RVA22U64-PACK-NEXT:    packh a2, a5, a2
+; RVA22U64-PACK-NEXT:    packh a0, a1, a0
+; RVA22U64-PACK-NEXT:    packw a0, a2, a0
+; RVA22U64-PACK-NEXT:    pack a0, a3, a0
 ; RVA22U64-PACK-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
-; RVA22U64-PACK-NEXT:    vmv.v.x v8, a2
+; RVA22U64-PACK-NEXT:    vmv.v.x v8, a6
 ; RVA22U64-PACK-NEXT:    vslide1down.vx v8, v8, a0
 ; RVA22U64-PACK-NEXT:    ret
 ;
@@ -1653,37 +1633,29 @@ define <16 x i8> @buildvec_v16i8_loads_gather(ptr %p) {
 ; RV32VB-PACK-NEXT:    lbu a3, 22(a0)
 ; RV32VB-PACK-NEXT:    lbu a4, 31(a0)
 ; RV32VB-PACK-NEXT:    packh a1, a1, a2
-; RV32VB-PACK-NEXT:    slli a3, a3, 16
-; RV32VB-PACK-NEXT:    slli a4, a4, 24
-; RV32VB-PACK-NEXT:    or a3, a4, a3
+; RV32VB-PACK-NEXT:    packh a2, a3, a4
+; RV32VB-PACK-NEXT:    pack a1, a1, a2
 ; RV32VB-PACK-NEXT:    lbu a2, 44(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 55(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 623(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 75(a0)
-; RV32VB-PACK-NEXT:    or a1, a1, a3
-; RV32VB-PACK-NEXT:    packh a2, a2, a4
-; RV32VB-PACK-NEXT:    slli a5, a5, 16
-; RV32VB-PACK-NEXT:    slli a6, a6, 24
-; RV32VB-PACK-NEXT:    or a3, a6, a5
-; RV32VB-PACK-NEXT:    lbu a4, 82(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 93(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 105(a0)
-; RV32VB-PACK-NEXT:    lbu a7, 161(a0)
-; RV32VB-PACK-NEXT:    or a2, a2, a3
+; RV32VB-PACK-NEXT:    lbu a3, 55(a0)
+; RV32VB-PACK-NEXT:    lbu a4, 623(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 75(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 82(a0)
+; RV32VB-PACK-NEXT:    lbu a7, 93(a0)
+; RV32VB-PACK-NEXT:    packh a2, a2, a3
 ; RV32VB-PACK-NEXT:    packh a3, a4, a5
-; RV32VB-PACK-NEXT:    slli a6, a6, 16
-; RV32VB-PACK-NEXT:    slli a7, a7, 24
-; RV32VB-PACK-NEXT:    or a4, a7, a6
-; RV32VB-PACK-NEXT:    lbu a5, 124(a0)
-; RV32VB-PACK-NEXT:    lbu a6, 163(a0)
-; RV32VB-PACK-NEXT:    lbu a7, 144(a0)
+; RV32VB-PACK-NEXT:    pack a2, a2, a3
+; RV32VB-PACK-NEXT:    packh a3, a6, a7
+; RV32VB-PACK-NEXT:    lbu a4, 105(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 161(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 124(a0)
+; RV32VB-PACK-NEXT:    lbu a7, 163(a0)
+; RV32VB-PACK-NEXT:    lbu t0, 144(a0)
 ; RV32VB-PACK-NEXT:    lbu a0, 154(a0)
-; RV32VB-PACK-NEXT:    or a3, a3, a4
-; RV32VB-PACK-NEXT:    packh a4, a5, a6
-; RV32VB-PACK-NEXT:    slli a7, a7, 16
-; RV32VB-PACK-NEXT:    slli a0, a0, 24
-; RV32VB-PACK-NEXT:    or a0, a0, a7
-; RV32VB-PACK-NEXT:    or a0, a4, a0
+; RV32VB-PACK-NEXT:    packh a4, a4, a5
+; RV32VB-PACK-NEXT:    pack a3, a3, a4
+; RV32VB-PACK-NEXT:    packh a4, a6, a7
+; RV32VB-PACK-NEXT:    packh a0, t0, a0
+; RV32VB-PACK-NEXT:    pack a0, a4, a0
 ; RV32VB-PACK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
 ; RV32VB-PACK-NEXT:    vmv.v.x v8, a1
 ; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a2
@@ -1790,45 +1762,33 @@ define <16 x i8> @buildvec_v16i8_loads_gather(ptr %p) {
 ; RVA22U64-PACK-NEXT:    lbu a3, 22(a0)
 ; RVA22U64-PACK-NEXT:    lbu a4, 31(a0)
 ; RVA22U64-PACK-NEXT:    packh a1, a1, a2
-; RVA22U64-PACK-NEXT:    slli a3, a3, 16
-; RVA22U64-PACK-NEXT:    slli a4, a4, 24
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    lbu a2, 44(a0)
-; RVA22U64-PACK-NEXT:    or a6, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 55(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 623(a0)
-; RVA22U64-PACK-NEXT:    slli a2, a2, 32
+; RVA22U64-PACK-NEXT:    packh a2, a3, a4
+; RVA22U64-PACK-NEXT:    lbu a3, 44(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 55(a0)
+; RVA22U64-PACK-NEXT:    packw a6, a1, a2
+; RVA22U64-PACK-NEXT:    lbu a2, 623(a0)
 ; RVA22U64-PACK-NEXT:    lbu a5, 75(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a2, a2, a3
-; RVA22U64-PACK-NEXT:    slli a4, a4, 48
-; RVA22U64-PACK-NEXT:    slli a5, a5, 56
-; RVA22U64-PACK-NEXT:    or a4, a4, a5
-; RVA22U64-PACK-NEXT:    or a2, a2, a4
-; RVA22U64-PACK-NEXT:    lbu a3, 82(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 93(a0)
-; RVA22U64-PACK-NEXT:    lbu a5, 105(a0)
-; RVA22U64-PACK-NEXT:    lbu a1, 161(a0)
-; RVA22U64-PACK-NEXT:    or a2, a6, a2
 ; RVA22U64-PACK-NEXT:    packh a3, a3, a4
-; RVA22U64-PACK-NEXT:    slli a5, a5, 16
-; RVA22U64-PACK-NEXT:    slli a1, a1, 24
-; RVA22U64-PACK-NEXT:    or a1, a1, a5
-; RVA22U64-PACK-NEXT:    lbu a4, 124(a0)
-; RVA22U64-PACK-NEXT:    or a1, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 163(a0)
-; RVA22U64-PACK-NEXT:    lbu a5, 144(a0)
-; RVA22U64-PACK-NEXT:    slli a4, a4, 32
+; RVA22U64-PACK-NEXT:    lbu a4, 82(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 93(a0)
+; RVA22U64-PACK-NEXT:    packh a2, a2, a5
+; RVA22U64-PACK-NEXT:    packw a2, a3, a2
+; RVA22U64-PACK-NEXT:    pack a6, a6, a2
+; RVA22U64-PACK-NEXT:    packh a7, a4, a1
+; RVA22U64-PACK-NEXT:    lbu a3, 105(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 161(a0)
+; RVA22U64-PACK-NEXT:    lbu a5, 124(a0)
+; RVA22U64-PACK-NEXT:    lbu a2, 163(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 144(a0)
 ; RVA22U64-PACK-NEXT:    lbu a0, 154(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    slli a5, a5, 48
-; RVA22U64-PACK-NEXT:    slli a0, a0, 56
-; RVA22U64-PACK-NEXT:    or a0, a0, a5
-; RVA22U64-PACK-NEXT:    or a0, a0, a3
-; RVA22U64-PACK-NEXT:    or a0, a0, a1
+; RVA22U64-PACK-NEXT:    packh a3, a3, a4
+; RVA22U64-PACK-NEXT:    packw a3, a7, a3
+; RVA22U64-PACK-NEXT:    packh a2, a5, a2
+; RVA22U64-PACK-NEXT:    packh a0, a1, a0
+; RVA22U64-PACK-NEXT:    packw a0, a2, a0
+; RVA22U64-PACK-NEXT:    pack a0, a3, a0
 ; RVA22U64-PACK-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
-; RVA22U64-PACK-NEXT:    vmv.v.x v8, a2
+; RVA22U64-PACK-NEXT:    vmv.v.x v8, a6
 ; RVA22U64-PACK-NEXT:    vslide1down.vx v8, v8, a0
 ; RVA22U64-PACK-NEXT:    ret
 ;
@@ -1979,25 +1939,23 @@ define <16 x i8> @buildvec_v16i8_undef_low_half(ptr %p) {
 ; RV32VB-PACK:       # %bb.0:
 ; RV32VB-PACK-NEXT:    lbu a1, 82(a0)
 ; RV32VB-PACK-NEXT:    lbu a2, 93(a0)
-; RV32VB-PACK-NEXT:    lbu a3, 105(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 161(a0)
 ; RV32VB-PACK-NEXT:    packh a1, a1, a2
-; RV32VB-PACK-NEXT:    slli a3, a3, 16
-; RV32VB-PACK-NEXT:    slli a4, a4, 24
-; RV32VB-PACK-NEXT:    or a3, a4, a3
-; RV32VB-PACK-NEXT:    lbu a2, 124(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 163(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 144(a0)
+; RV32VB-PACK-NEXT:    lbu a2, 105(a0)
+; RV32VB-PACK-NEXT:    lbu a3, 161(a0)
+; RV32VB-PACK-NEXT:    lbu a4, 124(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 163(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 144(a0)
 ; RV32VB-PACK-NEXT:    lbu a0, 154(a0)
-; RV32VB-PACK-NEXT:    or a1, a1, a3
-; RV32VB-PACK-NEXT:    packh a2, a2, a4
-; RV32VB-PACK-NEXT:    slli a5, a5, 16
-; RV32VB-PACK-NEXT:    slli a0, a0, 24
-; RV32VB-PACK-NEXT:    or a0, a0, a5
-; RV32VB-PACK-NEXT:    or a0, a2, a0
+; RV32VB-PACK-NEXT:    packh a2, a2, a3
+; RV32VB-PACK-NEXT:    pack a1, a1, a2
+; RV32VB-PACK-NEXT:    packh a2, a4, a5
+; RV32VB-PACK-NEXT:    packh a0, a6, a0
+; RV32VB-PACK-NEXT:    pack a0, a2, a0
+; RV32VB-PACK-NEXT:    packh a2, a0, a0
+; RV32VB-PACK-NEXT:    pack a2, a2, a2
 ; RV32VB-PACK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
-; RV32VB-PACK-NEXT:    vmv.v.i v8, 0
-; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, zero
+; RV32VB-PACK-NEXT:    vmv.v.x v8, a2
+; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a2
 ; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a1
 ; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a0
 ; RV32VB-PACK-NEXT:    ret
@@ -2056,27 +2014,24 @@ define <16 x i8> @buildvec_v16i8_undef_low_half(ptr %p) {
 ; RVA22U64-PACK:       # %bb.0:
 ; RVA22U64-PACK-NEXT:    lbu a1, 82(a0)
 ; RVA22U64-PACK-NEXT:    lbu a2, 93(a0)
-; RVA22U64-PACK-NEXT:    lbu a3, 105(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 161(a0)
-; RVA22U64-PACK-NEXT:    packh a1, a1, a2
-; RVA22U64-PACK-NEXT:    slli a3, a3, 16
-; RVA22U64-PACK-NEXT:    slli a4, a4, 24
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    lbu a2, 124(a0)
-; RVA22U64-PACK-NEXT:    or a1, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 163(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 144(a0)
-; RVA22U64-PACK-NEXT:    slli a2, a2, 32
+; RVA22U64-PACK-NEXT:    packh a6, a1, a2
+; RVA22U64-PACK-NEXT:    lbu a2, 105(a0)
+; RVA22U64-PACK-NEXT:    lbu a3, 161(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 124(a0)
+; RVA22U64-PACK-NEXT:    lbu a5, 163(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 144(a0)
 ; RVA22U64-PACK-NEXT:    lbu a0, 154(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a2, a2, a3
-; RVA22U64-PACK-NEXT:    slli a4, a4, 48
-; RVA22U64-PACK-NEXT:    slli a0, a0, 56
-; RVA22U64-PACK-NEXT:    or a0, a0, a4
-; RVA22U64-PACK-NEXT:    or a0, a0, a2
-; RVA22U64-PACK-NEXT:    or a0, a0, a1
+; RVA22U64-PACK-NEXT:    packh a2, a2, a3
+; RVA22U64-PACK-NEXT:    packw a2, a6, a2
+; RVA22U64-PACK-NEXT:    packh a3, a4, a5
+; RVA22U64-PACK-NEXT:    packh a0, a1, a0
+; RVA22U64-PACK-NEXT:    packw a0, a3, a0
+; RVA22U64-PACK-NEXT:    pack a0, a2, a0
+; RVA22U64-PACK-NEXT:    packh a1, a0, a0
+; RVA22U64-PACK-NEXT:    packw a1, a1, a1
+; RVA22U64-PACK-NEXT:    pack a1, a1, a1
 ; RVA22U64-PACK-NEXT:    vsetivli zero, 2, e64, m1, ta, ma
-; RVA22U64-PACK-NEXT:    vmv.v.i v8, 0
+; RVA22U64-PACK-NEXT:    vmv.v.x v8, a1
 ; RVA22U64-PACK-NEXT:    vslide1down.vx v8, v8, a0
 ; RVA22U64-PACK-NEXT:    ret
 ;
@@ -2184,27 +2139,25 @@ define <16 x i8> @buildvec_v16i8_undef_high_half(ptr %p) {
 ; RV32VB-PACK:       # %bb.0:
 ; RV32VB-PACK-NEXT:    lbu a1, 0(a0)
 ; RV32VB-PACK-NEXT:    lbu a2, 1(a0)
-; RV32VB-PACK-NEXT:    lbu a3, 22(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 31(a0)
 ; RV32VB-PACK-NEXT:    packh a1, a1, a2
-; RV32VB-PACK-NEXT:    slli a3, a3, 16
-; RV32VB-PACK-NEXT:    slli a4, a4, 24
-; RV32VB-PACK-NEXT:    or a3, a4, a3
-; RV32VB-PACK-NEXT:    lbu a2, 44(a0)
-; RV32VB-PACK-NEXT:    lbu a4, 55(a0)
-; RV32VB-PACK-NEXT:    lbu a5, 623(a0)
+; RV32VB-PACK-NEXT:    lbu a2, 22(a0)
+; RV32VB-PACK-NEXT:    lbu a3, 31(a0)
+; RV32VB-PACK-NEXT:    lbu a4, 44(a0)
+; RV32VB-PACK-NEXT:    lbu a5, 55(a0)
+; RV32VB-PACK-NEXT:    lbu a6, 623(a0)
 ; RV32VB-PACK-NEXT:    lbu a0, 75(a0)
-; RV32VB-PACK-NEXT:    or a1, a1, a3
-; RV32VB-PACK-NEXT:    packh a2, a2, a4
-; RV32VB-PACK-NEXT:    slli a5, a5, 16
-; RV32VB-PACK-NEXT:    slli a0, a0, 24
-; RV32VB-PACK-NEXT:    or a0, a0, a5
-; RV32VB-PACK-NEXT:    or a0, a2, a0
+; RV32VB-PACK-NEXT:    packh a2, a2, a3
+; RV32VB-PACK-NEXT:    pack a1, a1, a2
+; RV32VB-PACK-NEXT:    packh a2, a4, a5
+; RV32VB-PACK-NEXT:    packh a0, a6, a0
+; RV32VB-PACK-NEXT:    pack a0, a2, a0
 ; RV32VB-PACK-NEXT:    vsetivli zero, 4, e32, m1, ta, ma
 ; RV32VB-PACK-NEXT:    vmv.v.x v8, a1
 ; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a0
-; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, zero
-; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, zero
+; RV32VB-PACK-NEXT:    packh a0, a0, a0
+; RV32VB-PACK-NEXT:    pack a0, a0, a0
+; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a0
+; RV32VB-PACK-NEXT:    vslide1down.vx v8, v8, a0
 ; RV32VB-PACK-NEXT:    ret
 ;
 ; RV64V-ONLY-LABEL: buildvec_v16i8_undef_high_half:
@@ -2261,28 +2214,25 @@ define <16 x i8> @buildvec_v16i8_undef_high_half(ptr %p) {
 ; RVA22U64-PACK:       # %bb.0:
 ; RVA22U64-PACK-NEXT:    lbu a1, 0(a0)
 ; RVA22U64-PACK-NEXT:    lbu a2, 1(a0)
-; RVA22U64-PACK-NEXT:    lbu a3, 22(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 31(a0)
-; RVA22U64-PACK-NEXT:    packh a1, a1, a2
-; RVA22U64-PACK-NEXT:    slli a3, a3, 16
-; RVA22U64-PACK-NEXT:    slli a4, a4, 24
-; RVA22U64-PACK-NEXT:    or a3, a3, a4
-; RVA22U64-PACK-NEXT:    lbu a2, 44(a0)
-; RVA22U64-PACK-NEXT:    or a1, a1, a3
-; RVA22U64-PACK-NEXT:    lbu a3, 55(a0)
-; RVA22U64-PACK-NEXT:    lbu a4, 623(a0)
-; RVA22U64-PACK-NEXT:    slli a2, a2, 32
+; RVA22U64-PACK-NEXT:    packh a6, a1, a2
+; RVA22U64-PACK-NEXT:    lbu a2, 22(a0)
+; RVA22U64-PACK-NEXT:    lbu a3, 31(a0)
+; RVA22U64-PACK-NEXT:    lbu a4, 44(a0)
+; RVA22U64-PACK-NEXT:    lbu a5, 55(a0)
+; RVA22U64-PACK-NEXT:    lbu a1, 623(a0)
 ; RVA22U64-PACK-NEXT:    lbu a0, 75(a0)
-; RVA22U64-PACK-NEXT:    slli a3, a3, 40
-; RVA22U64-PACK-NEXT:    or a2, a2, a3
-; RVA22U64-PACK-NEXT:    slli a4, a4, 48
-; RVA22U64-PACK-NEXT:    slli a0, a0, 56
-; RVA22U64-PACK-NEXT:    or a...
[truncated]

SDValue ShtAmt = DAG.getConstant(ElemSizeInBits, ElemDL, XLenVT);
SDNodeFlags Flags;
Flags.setDisjoint(true);
Collaborator Author

FYI, the disjoint change here is arguably unrelated. I can separate it out if folks want. I did it in one change because I noticed it while playing with pack* matching options. On its own, this line has no codegen effect.
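
To spell out why the disjoint flag is valid on this OR, here is a small sketch of the generic (non-zbkb) path from the diff above: both elements are masked to `ElemSizeInBits`, the second one is shifted left by the same amount, and the two results are combined. The helper names (`lowBitsMask`, `packViaShiftOr`, `packViaShiftAdd`) are hypothetical and only model the arithmetic on 64-bit integers; they are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low `bits` bits set. Valid for 1 <= bits <= 63, which covers
// the element sizes (8, 16, 32) used when packing into a 64-bit scalar.
inline uint64_t lowBitsMask(unsigned bits) {
  return (uint64_t{1} << bits) - 1;
}

// Generic pairwise pack: A in the low element, B in the high element.
// After masking and shifting, the two operands occupy non-overlapping bit
// ranges, which is the property a disjoint OR asserts.
inline uint64_t packViaShiftOr(uint64_t a, uint64_t b, unsigned elemBits) {
  uint64_t mask = lowBitsMask(elemBits);
  uint64_t lo = a & mask;
  uint64_t hi = (b & mask) << elemBits;
  assert((lo & hi) == 0 && "operands share no set bits");
  return lo | hi;
}

// Because the operands are disjoint, no carries can occur, so an ADD of the
// same operands yields the same value as the OR.
inline uint64_t packViaShiftAdd(uint64_t a, uint64_t b, unsigned elemBits) {
  uint64_t mask = lowBitsMask(elemBits);
  return (a & mask) + ((b & mask) << elemBits);
}
```

For any inputs, `packViaShiftOr(a, b, n)` and `packViaShiftAdd(a, b, n)` agree, which is the kind of equivalence the flag lets later combines rely on.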

Collaborator

I'm ok with it.

Collaborator

@topperc topperc left a comment

LGTM

@preames preames merged commit c959357 into llvm:main Jul 8, 2024
7 of 8 checks passed
@preames preames deleted the pr-build-vector-packing-via-zbkb branch July 8, 2024 23:10
llvm-ci commented Jul 9, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot4 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/804

Here is the relevant piece of the build log for the reference:

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83824 of 83825 tests, 80 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test (79976 of 83824)
******************** TEST 'LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 4: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1 2>&1 | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test (82523 of 83824)
******************** TEST 'LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy -R .foo - -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/obj2yaml -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/obj2yaml -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy -R .foo - -
RUN: at line 39: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
RUN: at line 40: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
=================================================================
==2586793==ERROR: AddressSanitizer: heap-use-after-free on address 0x512000000370 at pc 0x555faae8172e bp 0x7ffc748d5dd0 sp 0x7ffc748d5dc8
READ of size 4 at 0x512000000370 thread T0
    #0 0x555faae8172d in llvm::objcopy::elf::Symbol::getShndx() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:689:20
    #1 0x555faae817a8 in llvm::objcopy::elf::Symbol::isCommon() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:706:40
    #2 0x555faae70c38 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:317:14
    #3 0x555faae70c38 in void llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>::callback_fn<updateAndRemoveSymbols(llvm::objcopy::CommonConfig const&, llvm::objcopy::ELFConfig const&, llvm::objcopy::elf::Object&)::$_0>(long, llvm::objcopy::elf::Symbol&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
    #4 0x555faae82810 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
    #5 0x555faae82810 in llvm::objcopy::elf::SymbolTableSection::updateSymbols(llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:761:5
    #6 0x555faae642f8 in updateAndRemoveSymbols /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:311:20
Step 9 (stage2/asan_ubsan check) failure: stage2/asan_ubsan check (failure)
...
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83824 of 83825 tests, 80 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test (79976 of 83824)
******************** TEST 'LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 4: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1 2>&1 | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test (82523 of 83824)
******************** TEST 'LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy -R .foo - -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/obj2yaml -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/obj2yaml -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy -R .foo - -
RUN: at line 39: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
RUN: at line 40: /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
=================================================================
==2586793==ERROR: AddressSanitizer: heap-use-after-free on address 0x512000000370 at pc 0x555faae8172e bp 0x7ffc748d5dd0 sp 0x7ffc748d5dc8
READ of size 4 at 0x512000000370 thread T0
    #0 0x555faae8172d in llvm::objcopy::elf::Symbol::getShndx() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:689:20
    #1 0x555faae817a8 in llvm::objcopy::elf::Symbol::isCommon() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:706:40
    #2 0x555faae70c38 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:317:14
    #3 0x555faae70c38 in void llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>::callback_fn<updateAndRemoveSymbols(llvm::objcopy::CommonConfig const&, llvm::objcopy::ELFConfig const&, llvm::objcopy::elf::Object&)::$_0>(long, llvm::objcopy::elf::Symbol&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
    #4 0x555faae82810 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
    #5 0x555faae82810 in llvm::objcopy::elf::SymbolTableSection::updateSymbols(llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:761:5
    #6 0x555faae642f8 in updateAndRemoveSymbols /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:311:20
Step 12 (stage2/msan check) failure: stage2/msan check (failure)
...
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using lld-link: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/lld-link
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using ld64.lld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/ld64.lld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:508: note: using wasm-ld: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/wasm-ld
llvm-lit: /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 83823 tests, 80 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test (58102 of 83823)
******************** TEST 'LLVM :: tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 4: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1 2>&1 | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/warning-skipped-types.test -DFILE=/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfutil --garbage-collection --tombstone=maxpc /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-dwarfutil/ELF/X86/Inputs/type-units.o /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-dwarfutil/ELF/X86/Output/warning-skipped-types.test.tmp1

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
FAIL: LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test (72549 of 83823)
******************** TEST 'LLVM :: tools/llvm-objcopy/ELF/remove-section-in-group.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 3: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-objcopy -R .foo - -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/obj2yaml -    | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/obj2yaml -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-objcopy -R .foo - -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/yaml2obj --docnum=1 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o -
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test
RUN: at line 39: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/yaml2obj --docnum=2 /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/tools/llvm-objcopy/ELF/remove-section-in-group.test -o /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
RUN: at line 40: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
+ /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-objcopy --remove-section=.debug_macro /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/test/tools/llvm-objcopy/ELF/Output/remove-section-in-group.test.tmp
==724056==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x564c0142f4d5 in llvm::objcopy::elf::Symbol::getShndx() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:704:1
    #1 0x564c0142f53e in llvm::objcopy::elf::Symbol::isCommon() const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:706:40
    #2 0x564c01424908 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:317:14
    #3 0x564c01424908 in void llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>::callback_fn<updateAndRemoveSymbols(llvm::objcopy::CommonConfig const&, llvm::objcopy::ELFConfig const&, llvm::objcopy::elf::Object&)::$_0>(long, llvm::objcopy::elf::Symbol&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12
    #4 0x564c0142feb8 in operator() /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12
    #5 0x564c0142feb8 in llvm::objcopy::elf::SymbolTableSection::updateSymbols(llvm::function_ref<void (llvm::objcopy::elf::Symbol&)>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObject.cpp:761:5
    #6 0x564c0141e049 in updateAndRemoveSymbols /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:311:20
    #7 0x564c0141e049 in handleArgs(llvm::objcopy::CommonConfig const&, llvm::objcopy::ELFConfig const&, llvm::objcopy::elf::Object&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:662:17
    #8 0x564c014210c6 in llvm::objcopy::elf::executeObjcopyOnBinary(llvm::objcopy::CommonConfig const&, llvm::objcopy::ELFConfig const&, llvm::object::ELFObjectFileBase&, llvm::raw_ostream&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp:880:17

aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
In 03d4332, we extended build_vector lowering to pack elements into the
largest size which doesn't exceed either ELEN or XLEN. The zbkb
extension - ratified under scalar crypto, but otherwise not really
connected to crypto per se - adds the packh, packw, and pack
instructions. These instructions are designed for exactly this pairwise
packing.

I ended up choosing to directly lower to machine nodes. A combination of
the slightly non-uniform semantics of these instructions (packw *sign*
extends the result, whereas packh *zero* extends it), and our generic
dag canonicalization (which sinks shl through or nodes), make pattern
matching these tricky and not particularly robust. Another alternative
was to have an ISD node for them, but that didn't seem to add much in
practice.
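
As a rough illustration of the pairwise packing described above, the sketch below models the three Zbkb instructions on RV64 as plain C++ functions and combines eight byte elements into one 64-bit scalar. It is not the code from this patch: the function names and the `packEightBytes` helper are hypothetical, and the instruction behavior is written from my reading of the Zbkb specification, so treat it as an assumption-laden model rather than the lowering itself.

```cpp
#include <array>
#include <cstdint>

// Software models of the RV64 Zbkb pack instructions (assumed semantics,
// written from the Zbkb specification; names are illustrative only).

// packh: low byte of rs1 in bits 7:0, low byte of rs2 in bits 15:8,
// result zero-extended to XLEN.
constexpr uint64_t packh(uint64_t rs1, uint64_t rs2) {
  return ((rs2 & 0xff) << 8) | (rs1 & 0xff);
}

// packw: low halfwords of rs1 and rs2 form a 32-bit value,
// which is then sign-extended to 64 bits.
constexpr uint64_t packw(uint64_t rs1, uint64_t rs2) {
  uint32_t lo32 =
      static_cast<uint32_t>(((rs2 & 0xffff) << 16) | (rs1 & 0xffff));
  return static_cast<uint64_t>(
      static_cast<int64_t>(static_cast<int32_t>(lo32)));
}

// pack: low 32 bits of rs1 in the low half, low 32 bits of rs2 in the
// high half of the result.
constexpr uint64_t pack(uint64_t rs1, uint64_t rs2) {
  return (rs2 << 32) | (rs1 & 0xffffffffull);
}

// Hypothetical helper: pack eight i8 elements (element 0 in the lowest
// byte) into one 64-bit scalar using a packh -> packw -> pack tree.
constexpr uint64_t packEightBytes(const std::array<uint8_t, 8> &e) {
  uint64_t h0 = packh(e[0], e[1]);
  uint64_t h1 = packh(e[2], e[3]);
  uint64_t h2 = packh(e[4], e[5]);
  uint64_t h3 = packh(e[6], e[7]);
  uint64_t w0 = packw(h0, h1);
  uint64_t w1 = packw(h2, h3);
  // pack reads only the low 32 bits of each operand, so the sign
  // extension performed by packw does not leak into the result.
  return pack(w0, w1);
}
```

In this model the sign extension done by `packw` is harmless inside the tree, because `pack` discards the upper half of both operands. It does show why the three instructions are awkward to match with a single generic shift-and-or pattern: each one extends its result differently.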