[AMDGPU] Promote uniform ops to I32 in DAGISel #106383
Conversation
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-selectiondag
Author: Pierre van Houtryve (Pierre-vh)
Changes: See #106382 for NFC test updates. Promote uniform binops, selects and setcc in Global & DAGISel instead of CGP. Solves #64591.
Patch is 1.35 MiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/106383.diff 88 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h b/llvm/include/llvm/CodeGen/TargetLowering.h
index eda38cd8a564d6..85310a4911b8ed 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -3299,7 +3299,7 @@ class TargetLoweringBase {
/// Return true if it's profitable to narrow operations of type SrcVT to
/// DestVT. e.g. on x86, it's profitable to narrow from i32 to i8 but not from
/// i32 to i16.
- virtual bool isNarrowingProfitable(EVT SrcVT, EVT DestVT) const {
+ virtual bool isNarrowingProfitable(SDNode *N, EVT SrcVT, EVT DestVT) const {
return false;
}
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index b0a906743f29ff..513ad392cb360a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -7031,7 +7031,7 @@ SDValue DAGCombiner::visitAND(SDNode *N) {
if (N1C->getAPIntValue().countLeadingZeros() >= (BitWidth - SrcBitWidth) &&
TLI.isTruncateFree(VT, SrcVT) && TLI.isZExtFree(SrcVT, VT) &&
TLI.isTypeDesirableForOp(ISD::AND, SrcVT) &&
- TLI.isNarrowingProfitable(VT, SrcVT))
+ TLI.isNarrowingProfitable(N, VT, SrcVT))
return DAG.getNode(ISD::ZERO_EXTEND, DL, VT,
DAG.getNode(ISD::AND, DL, SrcVT, N0Op0,
DAG.getZExtOrTrunc(N1, DL, SrcVT)));
@@ -14574,7 +14574,7 @@ SDValue DAGCombiner::reduceLoadWidth(SDNode *N) {
// ShLeftAmt will indicate how much a narrowed load should be shifted left.
unsigned ShLeftAmt = 0;
if (ShAmt == 0 && N0.getOpcode() == ISD::SHL && N0.hasOneUse() &&
- ExtVT == VT && TLI.isNarrowingProfitable(N0.getValueType(), VT)) {
+ ExtVT == VT && TLI.isNarrowingProfitable(N, N0.getValueType(), VT)) {
if (ConstantSDNode *N01 = dyn_cast<ConstantSDNode>(N0.getOperand(1))) {
ShLeftAmt = N01->getZExtValue();
N0 = N0.getOperand(0);
@@ -15118,9 +15118,11 @@ SDValue DAGCombiner::visitTRUNCATE(SDNode *N) {
}
// trunc (select c, a, b) -> select c, (trunc a), (trunc b)
- if (N0.getOpcode() == ISD::SELECT && N0.hasOneUse()) {
- if ((!LegalOperations || TLI.isOperationLegal(ISD::SELECT, SrcVT)) &&
- TLI.isTruncateFree(SrcVT, VT)) {
+ if (N0.getOpcode() == ISD::SELECT && N0.hasOneUse() &&
+ TLI.isTruncateFree(SrcVT, VT)) {
+ if (!LegalOperations ||
+ (TLI.isOperationLegal(ISD::SELECT, SrcVT) &&
+ TLI.isNarrowingProfitable(N0.getNode(), N0.getValueType(), VT))) {
SDLoc SL(N0);
SDValue Cond = N0.getOperand(0);
SDValue TruncOp0 = DAG.getNode(ISD::TRUNCATE, SL, VT, N0.getOperand(1));
@@ -20061,10 +20063,9 @@ SDValue DAGCombiner::ReduceLoadOpStoreWidth(SDNode *N) {
EVT NewVT = EVT::getIntegerVT(*DAG.getContext(), NewBW);
// The narrowing should be profitable, the load/store operation should be
// legal (or custom) and the store size should be equal to the NewVT width.
- while (NewBW < BitWidth &&
- (NewVT.getStoreSizeInBits() != NewBW ||
- !TLI.isOperationLegalOrCustom(Opc, NewVT) ||
- !TLI.isNarrowingProfitable(VT, NewVT))) {
+ while (NewBW < BitWidth && (NewVT.getStoreSizeInBits() != NewBW ||
+ !TLI.isOperationLegalOrCustom(Opc, NewVT) ||
+ !TLI.isNarrowingProfitable(N, VT, NewVT))) {
NewBW = NextPowerOf2(NewBW);
NewVT = EVT::getIntegerVT(*DAG.getContext(), NewBW);
}
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 4e796289cff0a1..97e10b3551db1a 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -1841,7 +1841,7 @@ bool TargetLowering::SimplifyDemandedBits(
for (unsigned SmallVTBits = llvm::bit_ceil(DemandedSize);
SmallVTBits < BitWidth; SmallVTBits = NextPowerOf2(SmallVTBits)) {
EVT SmallVT = EVT::getIntegerVT(*TLO.DAG.getContext(), SmallVTBits);
- if (isNarrowingProfitable(VT, SmallVT) &&
+ if (isNarrowingProfitable(Op.getNode(), VT, SmallVT) &&
isTypeDesirableForOp(ISD::SHL, SmallVT) &&
isTruncateFree(VT, SmallVT) && isZExtFree(SmallVT, VT) &&
(!TLO.LegalOperations() || isOperationLegal(ISD::SHL, SmallVT))) {
@@ -1865,7 +1865,7 @@ bool TargetLowering::SimplifyDemandedBits(
if ((BitWidth % 2) == 0 && !VT.isVector() && ShAmt < HalfWidth &&
DemandedBits.countLeadingOnes() >= HalfWidth) {
EVT HalfVT = EVT::getIntegerVT(*TLO.DAG.getContext(), HalfWidth);
- if (isNarrowingProfitable(VT, HalfVT) &&
+ if (isNarrowingProfitable(Op.getNode(), VT, HalfVT) &&
isTypeDesirableForOp(ISD::SHL, HalfVT) &&
isTruncateFree(VT, HalfVT) && isZExtFree(HalfVT, VT) &&
(!TLO.LegalOperations() || isOperationLegal(ISD::SHL, HalfVT))) {
@@ -1984,7 +1984,7 @@ bool TargetLowering::SimplifyDemandedBits(
if ((BitWidth % 2) == 0 && !VT.isVector()) {
APInt HiBits = APInt::getHighBitsSet(BitWidth, BitWidth / 2);
EVT HalfVT = EVT::getIntegerVT(*TLO.DAG.getContext(), BitWidth / 2);
- if (isNarrowingProfitable(VT, HalfVT) &&
+ if (isNarrowingProfitable(Op.getNode(), VT, HalfVT) &&
isTypeDesirableForOp(ISD::SRL, HalfVT) &&
isTruncateFree(VT, HalfVT) && isZExtFree(HalfVT, VT) &&
(!TLO.LegalOperations() || isOperationLegal(ISD::SRL, HalfVT)) &&
@@ -4762,9 +4762,11 @@ SDValue TargetLowering::SimplifySetCC(EVT VT, SDValue N0, SDValue N1,
case ISD::SETULT:
case ISD::SETULE: {
EVT newVT = N0.getOperand(0).getValueType();
+ // FIXME: Should use isNarrowingProfitable.
if (DCI.isBeforeLegalizeOps() ||
(isOperationLegal(ISD::SETCC, newVT) &&
- isCondCodeLegal(Cond, newVT.getSimpleVT()))) {
+ isCondCodeLegal(Cond, newVT.getSimpleVT()) &&
+ isTypeDesirableForOp(ISD::SETCC, newVT))) {
EVT NewSetCCVT = getSetCCResultType(Layout, *DAG.getContext(), newVT);
SDValue NewConst = DAG.getConstant(C1.trunc(InSize), dl, newVT);
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
index 052e1140533f3f..f689fcf62fe8eb 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
@@ -46,10 +46,10 @@ static cl::opt<bool> WidenLoads(
cl::init(false));
static cl::opt<bool> Widen16BitOps(
- "amdgpu-codegenprepare-widen-16-bit-ops",
- cl::desc("Widen uniform 16-bit instructions to 32-bit in AMDGPUCodeGenPrepare"),
- cl::ReallyHidden,
- cl::init(true));
+ "amdgpu-codegenprepare-widen-16-bit-ops",
+ cl::desc(
+ "Widen uniform 16-bit instructions to 32-bit in AMDGPUCodeGenPrepare"),
+ cl::ReallyHidden, cl::init(false));
static cl::opt<bool>
BreakLargePHIs("amdgpu-codegenprepare-break-large-phis",
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
index b2a3f9392157d1..01e96159babd03 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCombine.td
@@ -145,6 +145,31 @@ def expand_promoted_fmed3 : GICombineRule<
} // End Predicates = [NotHasMed3_16]
+def promote_i16_uniform_binops_frag : GICombinePatFrag<
+ (outs root:$dst), (ins),
+ !foreach(op, [G_ADD, G_SUB, G_SHL, G_ASHR, G_LSHR, G_AND, G_XOR, G_OR, G_MUL],
+ (pattern (op i16:$dst, i16:$lhs, i16:$rhs)))>;
+
+def promote_i16_uniform_binops : GICombineRule<
+ (defs root:$dst),
+ (match (promote_i16_uniform_binops_frag i16:$dst):$mi,
+ [{ return matchPromote16to32(*${mi}); }]),
+ (apply [{ applyPromote16to32(*${mi}); }])
+>;
+
+def promote_i16_uniform_ternary_frag : GICombinePatFrag<
+ (outs root:$dst), (ins),
+ !foreach(op, [G_ICMP, G_SELECT],
+ (pattern (op i16:$dst, $first, i16:$lhs, i16:$rhs)))>;
+
+def promote_i16_uniform_ternary : GICombineRule<
+ (defs root:$dst),
+ (match (promote_i16_uniform_ternary_frag i16:$dst):$mi,
+ [{ return matchPromote16to32(*${mi}); }]),
+ (apply [{ applyPromote16to32(*${mi}); }])
+>;
+
+
// Combines which should only apply on SI/CI
def gfx6gfx7_combines : GICombineGroup<[fcmp_select_to_fmin_fmax_legacy]>;
@@ -169,5 +194,6 @@ def AMDGPURegBankCombiner : GICombiner<
"AMDGPURegBankCombinerImpl",
[unmerge_merge, unmerge_cst, unmerge_undef,
zext_trunc_fold, int_minmax_to_med3, ptr_add_immed_chain,
- fp_minmax_to_clamp, fp_minmax_to_med3, fmed3_intrinsic_to_clamp]> {
+ fp_minmax_to_clamp, fp_minmax_to_med3, fmed3_intrinsic_to_clamp,
+ promote_i16_uniform_binops, promote_i16_uniform_ternary]> {
}
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 96143d688801aa..1a596cc80c0c9c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -1017,14 +1017,45 @@ bool AMDGPUTargetLowering::isZExtFree(EVT Src, EVT Dest) const {
return Src == MVT::i32 && Dest == MVT::i64;
}
-bool AMDGPUTargetLowering::isNarrowingProfitable(EVT SrcVT, EVT DestVT) const {
+bool AMDGPUTargetLowering::isNarrowingProfitable(SDNode *N, EVT SrcVT,
+ EVT DestVT) const {
+ switch (N->getOpcode()) {
+ case ISD::ADD:
+ case ISD::SUB:
+ case ISD::SHL:
+ case ISD::SRL:
+ case ISD::SRA:
+ case ISD::AND:
+ case ISD::OR:
+ case ISD::XOR:
+ case ISD::MUL:
+ case ISD::SETCC:
+ case ISD::SELECT:
+ if (Subtarget->has16BitInsts() &&
+ (DestVT.isVector() ? !Subtarget->hasVOP3PInsts() : true)) {
+ // Don't narrow back down to i16 if promoted to i32 already.
+ if (!N->isDivergent() && DestVT.isInteger() &&
+ DestVT.getScalarSizeInBits() > 1 &&
+ DestVT.getScalarSizeInBits() <= 16 &&
+ SrcVT.getScalarSizeInBits() > 16) {
+ return false;
+ }
+ }
+ return true;
+ default:
+ break;
+ }
+
// There aren't really 64-bit registers, but pairs of 32-bit ones and only a
// limited number of native 64-bit operations. Shrinking an operation to fit
// in a single 32-bit register should always be helpful. As currently used,
// this is much less general than the name suggests, and is only used in
// places trying to reduce the sizes of loads. Shrinking loads to < 32-bits is
// not profitable, and may actually be harmful.
- return SrcVT.getSizeInBits() > 32 && DestVT.getSizeInBits() == 32;
+ if (isa<LoadSDNode>(N))
+ return SrcVT.getSizeInBits() > 32 && DestVT.getSizeInBits() == 32;
+
+ return true;
}
bool AMDGPUTargetLowering::isDesirableToCommuteWithShift(
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
index 59f640ea99de3e..4dfa7ac052a5ba 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.h
@@ -201,7 +201,7 @@ class AMDGPUTargetLowering : public TargetLowering {
NegatibleCost &Cost,
unsigned Depth) const override;
- bool isNarrowingProfitable(EVT SrcVT, EVT DestVT) const override;
+ bool isNarrowingProfitable(SDNode *N, EVT SrcVT, EVT DestVT) const override;
bool isDesirableToCommuteWithShift(const SDNode *N,
CombineLevel Level) const override;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp b/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
index e236a5d7522e02..3b4faa35b93738 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
@@ -89,6 +89,9 @@ class AMDGPURegBankCombinerImpl : public Combiner {
void applyMed3(MachineInstr &MI, Med3MatchInfo &MatchInfo) const;
void applyClamp(MachineInstr &MI, Register &Reg) const;
+ bool matchPromote16to32(MachineInstr &MI) const;
+ void applyPromote16to32(MachineInstr &MI) const;
+
private:
SIModeRegisterDefaults getMode() const;
bool getIEEE() const;
@@ -348,6 +351,116 @@ bool AMDGPURegBankCombinerImpl::matchFPMed3ToClamp(MachineInstr &MI,
return false;
}
+bool AMDGPURegBankCombinerImpl::matchPromote16to32(MachineInstr &MI) const {
+ Register Dst = MI.getOperand(0).getReg();
+ LLT DstTy = MRI.getType(Dst);
+ const auto *RB = MRI.getRegBankOrNull(Dst);
+
+ // Only promote uniform instructions.
+ if (RB->getID() != AMDGPU::SGPRRegBankID)
+ return false;
+
+ // Promote only if:
+ // - We have 16 bit insts (not true 16 bit insts).
+ // - We don't have packed instructions (for vector types only).
+ // TODO: For vector types, the set of packed operations is more limited, so
+ // may want to promote some anyway.
+ return STI.has16BitInsts() &&
+ (DstTy.isVector() ? !STI.hasVOP3PInsts() : true);
+}
+
+static unsigned getExtOpcodeForPromotedOp(MachineInstr &MI) {
+ switch (MI.getOpcode()) {
+ case AMDGPU::G_ASHR:
+ return AMDGPU::G_SEXT;
+ case AMDGPU::G_ADD:
+ case AMDGPU::G_SUB:
+ case AMDGPU::G_FSHR:
+ return AMDGPU::G_ZEXT;
+ case AMDGPU::G_AND:
+ case AMDGPU::G_OR:
+ case AMDGPU::G_XOR:
+ case AMDGPU::G_SHL:
+ case AMDGPU::G_SELECT:
+ case AMDGPU::G_MUL:
+ // operation result won't be influenced by garbage high bits.
+ // TODO: are all of those cases correct, and are there more?
+ return AMDGPU::G_ANYEXT;
+ case AMDGPU::G_ICMP: {
+ return CmpInst::isSigned(cast<GICmp>(MI).getCond()) ? AMDGPU::G_SEXT
+ : AMDGPU::G_ZEXT;
+ }
+ default:
+ llvm_unreachable("unexpected opcode!");
+ }
+}
+
+void AMDGPURegBankCombinerImpl::applyPromote16to32(MachineInstr &MI) const {
+ const unsigned Opc = MI.getOpcode();
+ assert(Opc == AMDGPU::G_ADD || Opc == AMDGPU::G_SUB || Opc == AMDGPU::G_SHL ||
+ Opc == AMDGPU::G_LSHR || Opc == AMDGPU::G_ASHR ||
+ Opc == AMDGPU::G_AND || Opc == AMDGPU::G_OR || Opc == AMDGPU::G_XOR ||
+ Opc == AMDGPU::G_MUL || Opc == AMDGPU::G_SELECT ||
+ Opc == AMDGPU::G_ICMP);
+
+ Register Dst = MI.getOperand(0).getReg();
+
+ bool IsSelectOrCmp = (Opc == AMDGPU::G_SELECT || Opc == AMDGPU::G_ICMP);
+ Register LHS = MI.getOperand(IsSelectOrCmp + 1).getReg();
+ Register RHS = MI.getOperand(IsSelectOrCmp + 2).getReg();
+
+ assert(MRI.getType(Dst) == LLT::scalar(16));
+ assert(MRI.getType(LHS) == LLT::scalar(16));
+ assert(MRI.getType(RHS) == LLT::scalar(16));
+
+ assert(MRI.getRegBankOrNull(Dst)->getID() == AMDGPU::SGPRRegBankID);
+ assert(MRI.getRegBankOrNull(LHS)->getID() == AMDGPU::SGPRRegBankID);
+ assert(MRI.getRegBankOrNull(RHS)->getID() == AMDGPU::SGPRRegBankID);
+ const RegisterBank &RB = *MRI.getRegBankOrNull(Dst);
+
+ LLT S32 = LLT::scalar(32);
+
+ B.setInstrAndDebugLoc(MI);
+ const unsigned ExtOpc = getExtOpcodeForPromotedOp(MI);
+ LHS = B.buildInstr(ExtOpc, {S32}, {LHS}).getReg(0);
+ RHS = B.buildInstr(ExtOpc, {S32}, {RHS}).getReg(0);
+
+ MRI.setRegBank(LHS, RB);
+ MRI.setRegBank(RHS, RB);
+
+ MachineInstr *NewInst;
+ if (IsSelectOrCmp)
+ NewInst = B.buildInstr(Opc, {Dst}, {MI.getOperand(1), LHS, RHS});
+ else
+ NewInst = B.buildInstr(Opc, {S32}, {LHS, RHS});
+
+ if (Opc != AMDGPU::G_ICMP) {
+ Register Dst32 = NewInst->getOperand(0).getReg();
+ MRI.setRegBank(Dst32, RB);
+ B.buildTrunc(Dst, Dst32);
+ }
+
+ switch (Opc) {
+ case AMDGPU::G_ADD:
+ case AMDGPU::G_SHL:
+ NewInst->setFlag(MachineInstr::NoUWrap);
+ NewInst->setFlag(MachineInstr::NoSWrap);
+ break;
+ case AMDGPU::G_SUB:
+ if (MI.getFlag(MachineInstr::NoUWrap))
+ NewInst->setFlag(MachineInstr::NoUWrap);
+ NewInst->setFlag(MachineInstr::NoSWrap);
+ break;
+ case AMDGPU::G_MUL:
+ NewInst->setFlag(MachineInstr::NoUWrap);
+ if (MI.getFlag(MachineInstr::NoUWrap))
+ NewInst->setFlag(MachineInstr::NoUWrap);
+ break;
+ }
+
+ MI.eraseFromParent();
+}
+
void AMDGPURegBankCombinerImpl::applyClamp(MachineInstr &MI,
Register &Reg) const {
B.buildInstr(AMDGPU::G_AMDGPU_CLAMP, {MI.getOperand(0)}, {Reg},
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 1437f3d58b5e79..96a59acd751a62 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -894,6 +894,7 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
ISD::UADDO_CARRY,
ISD::SUB,
ISD::USUBO_CARRY,
+ ISD::MUL,
ISD::FADD,
ISD::FSUB,
ISD::FDIV,
@@ -909,9 +910,17 @@ SITargetLowering::SITargetLowering(const TargetMachine &TM,
ISD::UMIN,
ISD::UMAX,
ISD::SETCC,
+ ISD::SELECT,
+ ISD::SMIN,
+ ISD::SMAX,
+ ISD::UMIN,
+ ISD::UMAX,
ISD::AND,
ISD::OR,
ISD::XOR,
+ ISD::SHL,
+ ISD::SRL,
+ ISD::SRA,
ISD::FSHR,
ISD::SINT_TO_FP,
ISD::UINT_TO_FP,
@@ -1935,13 +1944,6 @@ bool SITargetLowering::isTypeDesirableForOp(unsigned Op, EVT VT) const {
switch (Op) {
case ISD::LOAD:
case ISD::STORE:
-
- // These operations are done with 32-bit instructions anyway.
- case ISD::AND:
- case ISD::OR:
- case ISD::XOR:
- case ISD::SELECT:
- // TODO: Extensions?
return true;
default:
return false;
@@ -6746,6 +6748,122 @@ SDValue SITargetLowering::lowerFLDEXP(SDValue Op, SelectionDAG &DAG) const {
return DAG.getNode(ISD::FLDEXP, DL, VT, Op.getOperand(0), TruncExp);
}
+static unsigned getExtOpcodeForPromotedOp(SDValue Op) {
+ switch (Op->getOpcode()) {
+ case ISD::SRA:
+ case ISD::SMIN:
+ case ISD::SMAX:
+ return ISD::SIGN_EXTEND;
+ case ISD::ADD:
+ case ISD::SUB:
+ case ISD::SRL:
+ case ISD::UMIN:
+ case ISD::UMAX:
+ return ISD::ZERO_EXTEND;
+ case ISD::AND:
+ case ISD::OR:
+ case ISD::XOR:
+ case ISD::SHL:
+ case ISD::SELECT:
+ case ISD::MUL:
+ // operation result won't be influenced by garbage high bits.
+ // TODO: are all of those cases correct, and are there more?
+ return ISD::ANY_EXTEND;
+ case ISD::SETCC: {
+ ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(2))->get();
+ return ISD::isSignedIntSetCC(CC) ? ISD::SIGN_EXTEND : ISD::ZERO_EXTEND;
+ }
+ default:
+ llvm_unreachable("unexpected opcode!");
+ }
+}
+
+SDValue SITargetLowering::promoteUniformOpToI32(SDValue Op,
+ DAGCombinerInfo &DCI) const {
+ const unsigned Opc = Op.getOpcode();
+ assert(Opc == ISD::ADD || Opc == ISD::SUB || Opc == ISD::SHL ||
+ Opc == ISD::SRL || Opc == ISD::SRA || Opc == ISD::AND ||
+ Opc == ISD::OR || Opc == ISD::XOR || Opc == ISD::MUL ||
+ Opc == ISD::SETCC || Opc == ISD::SELECT || Opc == ISD::SMIN ||
+ Opc == ISD::SMAX || Opc == ISD::UMIN || Opc == ISD::UMAX);
+
+ EVT OpTy = (Opc != ISD::SETCC) ? Op.getValueType()
+ : Op->getOperand(0).getValueType();
+
+ if (DCI.isBeforeLegalizeOps())
+ return SDValue();
+
+ // Promote only if:
+ // - We have 16 bit insts (not true 16 bit insts).
+ // - We don't have packed instructions (for vector types only).
+ // TODO: For vector types, the set of packed operations is more limited, so
+ // may want to promote some anyway.
+ if (!Subtarget->has16BitInsts() ||
+ (OpTy.isVector() ? Subtarget->hasVOP3PInsts() : false))
+ return SDValue();
+
+ // Promote uniform scalar and vector integers between 2 and 16 bits.
+ if (Op->isDivergent() || !OpTy.isInteger() ||
+ OpTy.getScalarSizeInBits() == 1 || OpTy.getScalarSizeInBits() > 16)
+ return SDValue();
+
+ auto &DAG = DCI.DAG;
+
+ SDLoc DL(Op);
+ SDValue LHS;
+ SDValue RHS;
+ if (Opc == ISD::SELECT) {
+ LHS = Op->getOperand(1);
+ RHS = Op->getOperand(2);
+ } else {
+ LHS = Op->getOperand(0)...
[truncated]
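For readers skimming the truncated diff above, the core of the change is the extension choice made when widening a uniform 16-bit operation to 32 bits (see getExtOpcodeForPromotedOp in the patch): sign-extend for arithmetic shift right and signed min/max, zero-extend for add/sub, logical shift right and unsigned min/max, and any-extend where garbage high bits cannot affect the low 16 bits of the result. The following standalone sketch only models that reasoning on plain integers; it is not the LLVM code, and the helper names (add16_via_i32, ashr16_via_i32) are made up for illustration.

```cpp
// Standalone model of promoting 16-bit ops to 32 bits and truncating back.
// Illustrative only; mirrors the idea behind getExtOpcodeForPromotedOp.
#include <cassert>
#include <cstdint>

// Add: the low 16 bits of a 32-bit add depend only on the low 16 bits of the
// operands, so the extension kind (here zero-extension) does not change the
// truncated result compared to a native 16-bit add.
static uint16_t add16_via_i32(uint16_t a, uint16_t b) {
  uint32_t wa = a, wb = b;               // extend to 32 bits
  return static_cast<uint16_t>(wa + wb); // operate in i32, truncate back
}

// Arithmetic shift right: this one needs sign-extension, because the bits
// shifted into the top of the 16-bit result come from the sign bit.
static int16_t ashr16_via_i32(int16_t a, unsigned amt) {
  int32_t wa = a;                                // sign-extend to 32 bits
  return static_cast<int16_t>(wa >> (amt & 15)); // operate in i32, truncate
}

int main() {
  assert(add16_via_i32(0xFFFF, 1) == static_cast<uint16_t>(0xFFFFu + 1u));
  assert(ashr16_via_i32(static_cast<int16_t>(-32768), 3) ==
         static_cast<int16_t>(static_cast<int16_t>(-32768) >> 3));
  return 0;
}
```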
@llvm/pr-subscribers-llvm-globalisel
Author: Pierre van Houtryve (Pierre-vh)
Changes: See #106382 for NFC test updates. Promote uniform binops, selects and setcc in Global & DAGISel instead of CGP. Solves #64591. (The bot posted the same truncated diff as above.)
Force-pushed from 3f23227 to dc9f16d.
Please only review the last commit; see llvm#106383 for the DAGISel changes.
GlobalISel counterpart of llvm#106383. See llvm#64591.
Some of my comments on #106557 also apply here.
Promote uniform binops, selects and setcc in Global & DAGISel instead of CGP. Solves llvm#64591
Force-pushed from 6aa6e20 to f9feb66.
LLVM Buildbot has detected new failures on several builders. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/140/builds/6913
https://lab.llvm.org/buildbot/#/builders/123/builds/5908
https://lab.llvm.org/buildbot/#/builders/174/builds/5380
https://lab.llvm.org/buildbot/#/builders/175/builds/5468
https://lab.llvm.org/buildbot/#/builders/185/builds/5450
https://lab.llvm.org/buildbot/#/builders/137/builds/5507
https://lab.llvm.org/buildbot/#/builders/73/builds/5762
Fix failure caused by #106383
LLVM Buildbot has detected new failures on several builders. Full details are available at:
https://lab.llvm.org/buildbot/#/builders/153/builds/9376
https://lab.llvm.org/buildbot/#/builders/95/builds/4003
https://lab.llvm.org/buildbot/#/builders/108/builds/3894
https://lab.llvm.org/buildbot/#/builders/168/builds/3495
https://lab.llvm.org/buildbot/#/builders/187/builds/1287
https://lab.llvm.org/buildbot/#/builders/33/builds/3284
https://lab.llvm.org/buildbot/#/builders/4/builds/2253
https://lab.llvm.org/buildbot/#/builders/25/builds/2594
https://lab.llvm.org/buildbot/#/builders/56/builds/7778
https://lab.llvm.org/buildbot/#/builders/16/builds/5621
Promote uniform binops, selects and setcc between 2 and 16 bits to 32 bits in DAGISel. Solves llvm#64591.
Fix failure caused by llvm#106383
LLVM Buildbot has detected a new failure. Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/8010
@@ -18,189 +18,33 @@ declare hidden half @_Z4pownDhi(half, i32)
; --------------------------------------------------------------------

define half @test_pow_fast_f16(half %x, half %y) {
; CHECK-LABEL: test_pow_fast_f16:
; CHECK: ; %bb.0:
This file lost all the test checks
I fixed this in 528bcf3
GlobalISel counterpart of llvm#106383. See llvm#64591.
if (Op.getOpcode() == ISD::SRA || Op.getOpcode() == ISD::SRL ||
    Op.getOpcode() == ISD::SRA)
Suggested change:
-  if (Op.getOpcode() == ISD::SRA || Op.getOpcode() == ISD::SRL ||
-      Op.getOpcode() == ISD::SRA)
+  if (Op.getOpcode() == ISD::SHL || Op.getOpcode() == ISD::SRL ||
+      Op.getOpcode() == ISD::SRA)
@Pierre-vh ping - this looks like it was a simple typo?
Oops sorry, I'll fix it right now
Promote uniform binops, selects and setcc between 2 and 16 bits to 32 bits in DAGISel
Solves #64591
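The commit above also covers setcc, and the same extension reasoning applies to the promoted compares: getExtOpcodeForPromotedOp sign-extends the operands of signed conditions and zero-extends unsigned ones. A minimal standalone illustration of why that choice matters (again not the LLVM code; the helper names are hypothetical):

```cpp
// Standalone model of widening 16-bit compares to 32 bits.
#include <cassert>
#include <cstdint>

// Signed "less than": operands must be sign-extended before comparing in i32.
static bool slt16_via_i32(int16_t a, int16_t b) {
  int32_t wa = a, wb = b; // sign-extend
  return wa < wb;
}

// Unsigned "less than": operands must be zero-extended.
static bool ult16_via_i32(uint16_t a, uint16_t b) {
  uint32_t wa = a, wb = b; // zero-extend
  return wa < wb;
}

int main() {
  assert(slt16_via_i32(-1, 0));      // -1 < 0 when treated as signed i16
  assert(!ult16_via_i32(0xFFFF, 0)); // 0xFFFF is the largest unsigned i16
  assert(ult16_via_i32(0, 0xFFFF));
  return 0;
}
```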