Skip to content

[RISCV] Porting hasAllNBitUsers to RISCV GISel for instruction select #124678

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

lquinn2015
Copy link
Contributor

@lquinn2015 lquinn2015 commented Jan 28, 2025

Ported hasAllNBitUsers to RISCV GISel side. Add GISelPredicate code to each of the 16,32, and 64 bit words. It allows for generation of optimized packw sequences along with other transparent narrowing operations. Included a few new .ll files to expand testing and limited the OptW pass Optimization to fewer options until GISel is ready for more code generation paths

@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-risc-v

Author: Luke Quinn (lquinn2015)

Changes

Ported hasAllNBitUsers to RISCV GISel side. Add GISelPredicate code to each of the 16,32, and 64 bit words. It allows for generation of optimized packw sequences along with other transparent narrowing operations.


Patch is 21.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124678.diff

8 Files Affected:

  • (modified) llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp (+167)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+6-2)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/combine.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll (+6-6)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll (+38-38)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll (+4-11)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll (+1-1)
  • (modified) llvm/test/CodeGen/RISCV/GlobalISel/vararg.ll (+2-2)
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp b/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
index 3f1539da4a9c84..69cf1fd7653a6c 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVInstructionSelector.cpp
@@ -57,6 +57,12 @@ class RISCVInstructionSelector : public InstructionSelector {
   const TargetRegisterClass *
   getRegClassForTypeOnBank(LLT Ty, const RegisterBank &RB) const;
 
+  // const MachineInstr &MI
+  bool hasAllNBitUsers(const MachineInstr &MI, unsigned Bits, const unsigned Depth = 0) const;
+  bool hasAllBUsers(const MachineInstr &MI) const { return hasAllNBitUsers(MI, 8); }
+  bool hasAllHUsers(const MachineInstr &MI) const { return hasAllNBitUsers(MI, 16); }
+  bool hasAllWUsers(const MachineInstr &MI) const { return hasAllNBitUsers(MI, 32); }
+
   bool isRegInGprb(Register Reg) const;
   bool isRegInFprb(Register Reg) const;
 
@@ -186,6 +192,167 @@ RISCVInstructionSelector::RISCVInstructionSelector(
 {
 }
 
+bool RISCVInstructionSelector::hasAllNBitUsers(const MachineInstr &MI, unsigned Bits, const unsigned Depth) const {
+
+  assert((MI.getOpcode() == TargetOpcode::G_ADD ||
+          MI.getOpcode() == TargetOpcode::G_SUB ||
+          MI.getOpcode() == TargetOpcode::G_MUL ||
+          MI.getOpcode() == TargetOpcode::G_SHL ||
+          MI.getOpcode() == TargetOpcode::G_LSHR ||
+          MI.getOpcode() == TargetOpcode::G_AND ||
+          MI.getOpcode() == TargetOpcode::G_OR ||
+          MI.getOpcode() == TargetOpcode::G_XOR ||
+          MI.getOpcode() == TargetOpcode::G_SEXT_INREG || Depth != 0) &&
+         "Unexpected opcode");
+
+  if (Depth >= /*TODO*/ 20)
+    return false;
+
+  // Skip Vectors
+  // if(Depth == 0 && !MI.getOperand(0).isScalar())
+  //    return false;
+
+  for (MachineInstr &Use : MRI->use_instructions(MI.getOperand(0).getReg())) {
+
+    switch (Use.getOpcode()) {
+    default:
+      // if (vectorPseudoHasAllNBitUsers(User, Use.getNumOperands(), Bits, TII))
+      //   break;
+      return false;
+    case RISCV::ADDW:
+    case RISCV::ADDIW:
+    case RISCV::SUBW:
+    case RISCV::MULW:
+    case RISCV::SLLW:
+    case RISCV::SLLIW:
+    case RISCV::SRAW:
+    case RISCV::SRAIW:
+    case RISCV::SRLW:
+    case RISCV::SRLIW:
+    case RISCV::DIVW:
+    case RISCV::DIVUW:
+    case RISCV::REMW:
+    case RISCV::REMUW:
+    case RISCV::ROLW:
+    case RISCV::RORW:
+    case RISCV::RORIW:
+    case RISCV::CLZW:
+    case RISCV::CTZW:
+    case RISCV::CPOPW:
+    case RISCV::SLLI_UW:
+    case RISCV::FMV_W_X:
+    case RISCV::FCVT_H_W:
+    case RISCV::FCVT_H_W_INX:
+    case RISCV::FCVT_H_WU:
+    case RISCV::FCVT_H_WU_INX:
+    case RISCV::FCVT_S_W:
+    case RISCV::FCVT_S_W_INX:
+    case RISCV::FCVT_S_WU:
+    case RISCV::FCVT_S_WU_INX:
+    case RISCV::FCVT_D_W:
+    case RISCV::FCVT_D_W_INX:
+    case RISCV::FCVT_D_WU:
+    case RISCV::FCVT_D_WU_INX:
+    case RISCV::TH_REVW:
+    case RISCV::TH_SRRIW:
+      if (Bits >= 32)
+        break;
+      return false;
+    case RISCV::SLL:
+    case RISCV::SRA:
+    case RISCV::SRL:
+    case RISCV::ROL:
+    case RISCV::ROR:
+    case RISCV::BSET:
+    case RISCV::BCLR:
+    case RISCV::BINV:
+      // Shift amount operands only use log2(Xlen) bits.
+      if (Use.getNumOperands() == 1 && Bits >= Log2_32(Subtarget->getXLen()))
+        break;
+      return false;
+    case RISCV::SLLI:
+      // SLLI only uses the lower (XLen - ShAmt) bits.
+      if (Bits >= Subtarget->getXLen() - Use.getOperand(2).getImm())
+        break;
+      return false;
+    case RISCV::ANDI:
+      if (Bits >= (unsigned)llvm::bit_width<uint64_t>(
+                      ~((uint64_t)Use.getOperand(2).getImm())))
+        break;
+      goto RecCheck;
+    case RISCV::ORI: {
+      uint64_t Imm = Use.getOperand(2).getImm();
+      if (Bits >= (unsigned)llvm::bit_width<uint64_t>(~Imm))
+        break;
+      [[fallthrough]];
+    }
+    case RISCV::AND:
+    case RISCV::OR:
+    case RISCV::XOR:
+    case RISCV::XORI:
+    case RISCV::ANDN:
+    case RISCV::ORN:
+    case RISCV::XNOR:
+    case RISCV::SH1ADD:
+    case RISCV::SH2ADD:
+    case RISCV::SH3ADD:
+    RecCheck:
+      if (hasAllNBitUsers(Use, Bits, Depth + 1))
+        break;
+      return false;
+    case RISCV::SRLI: {
+      unsigned ShAmt = Use.getOperand(2).getImm();
+      // If we are shifting right by less than Bits, and users don't demand any
+      // bits that were shifted into [Bits-1:0], then we can consider this as an
+      // N-Bit user.
+      if (Bits > ShAmt && hasAllNBitUsers(Use, Bits - ShAmt, Depth + 1))
+        break;
+      return false;
+    }
+    case RISCV::SEXT_B:
+    case RISCV::PACKH:
+      if (Bits >= 8)
+        break;
+      return false;
+    case RISCV::SEXT_H:
+    case RISCV::FMV_H_X:
+    case RISCV::ZEXT_H_RV32:
+    case RISCV::ZEXT_H_RV64:
+    case RISCV::PACKW:
+      if (Bits >= 16)
+        break;
+      return false;
+    case RISCV::PACK:
+      if (Bits >= (Subtarget->getXLen() / 2))
+        break;
+      return false;
+    case RISCV::ADD_UW:
+    case RISCV::SH1ADD_UW:
+    case RISCV::SH2ADD_UW:
+    case RISCV::SH3ADD_UW:
+      // The first operand to add.uw/shXadd.uw is implicitly zero extended from
+      // 32 bits.
+      if (Use.getNumOperands() == 0 && Bits >= 32)
+        break;
+      return false;
+    case RISCV::SB:
+      if (Use.getNumOperands() == 0 && Bits >= 8)
+        break;
+      return false;
+    case RISCV::SH:
+      if (Use.getNumOperands() == 0 && Bits >= 16)
+        break;
+      return false;
+    case RISCV::SW:
+      if (Use.getNumOperands() == 0 && Bits >= 32)
+        break;
+      return false;
+    }
+  }
+
+  return true;
+}
+
 InstructionSelector::ComplexRendererFns
 RISCVInstructionSelector::selectShiftMask(MachineOperand &Root,
                                           unsigned ShiftWidth) const {
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index bb5bb6352c32a5..fbfc354daa2f29 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -1945,7 +1945,9 @@ class binop_allhusers<SDPatternOperator operator>
     : PatFrag<(ops node:$lhs, node:$rhs),
               (XLenVT (operator node:$lhs, node:$rhs)), [{
   return hasAllHUsers(Node);
-}]>;
+}]> {
+    let GISelPredicateCode = [{ return hasAllHUsers(MI); }];
+}
 
 // PatFrag to allow ADDW/SUBW/MULW/SLLW to be selected from i64 add/sub/mul/shl
 // if only the lower 32 bits of their result is used.
@@ -1953,7 +1955,9 @@ class binop_allwusers<SDPatternOperator operator>
     : PatFrag<(ops node:$lhs, node:$rhs),
               (i64 (operator node:$lhs, node:$rhs)), [{
   return hasAllWUsers(Node);
-}]>;
+}]> {
+  let GISelPredicateCode = [{ return hasAllWUsers(MI); }];
+}
 
 def sexti32_allwusers : PatFrag<(ops node:$src),
                                 (sext_inreg node:$src, i32), [{
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/combine.ll b/llvm/test/CodeGen/RISCV/GlobalISel/combine.ll
index 360e84d37ec858..61d1fa5a5b9f4b 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/combine.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/combine.ll
@@ -20,7 +20,7 @@ define i32 @constant_to_rhs(i32 %x) {
 ; RV64-O0:       # %bb.0:
 ; RV64-O0-NEXT:    mv a1, a0
 ; RV64-O0-NEXT:    li a0, 1
-; RV64-O0-NEXT:    add a0, a0, a1
+; RV64-O0-NEXT:    addw a0, a0, a1
 ; RV64-O0-NEXT:    sext.w a0, a0
 ; RV64-O0-NEXT:    ret
 ;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll
index a29219bfde06bb..c4847effc3ce89 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb-zbkb.ll
@@ -107,7 +107,7 @@ declare i32 @llvm.fshl.i32(i32, i32, i32)
 define signext i32 @rol_i32(i32 signext %a, i32 signext %b) nounwind {
 ; RV64I-LABEL: rol_i32:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    neg a2, a1
+; RV64I-NEXT:    negw a2, a1
 ; RV64I-NEXT:    sllw a1, a0, a1
 ; RV64I-NEXT:    srlw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -125,7 +125,7 @@ define signext i32 @rol_i32(i32 signext %a, i32 signext %b) nounwind {
 define void @rol_i32_nosext(i32 signext %a, i32 signext %b, ptr %x) nounwind {
 ; RV64I-LABEL: rol_i32_nosext:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    neg a3, a1
+; RV64I-NEXT:    negw a3, a1
 ; RV64I-NEXT:    sllw a1, a0, a1
 ; RV64I-NEXT:    srlw a0, a0, a3
 ; RV64I-NEXT:    or a0, a1, a0
@@ -146,7 +146,7 @@ define signext i32 @rol_i32_neg_constant_rhs(i32 signext %a) nounwind {
 ; RV64I-LABEL: rol_i32_neg_constant_rhs:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    li a1, -2
-; RV64I-NEXT:    neg a2, a0
+; RV64I-NEXT:    negw a2, a0
 ; RV64I-NEXT:    sllw a0, a1, a0
 ; RV64I-NEXT:    srlw a1, a1, a2
 ; RV64I-NEXT:    or a0, a0, a1
@@ -185,7 +185,7 @@ declare i32 @llvm.fshr.i32(i32, i32, i32)
 define signext i32 @ror_i32(i32 signext %a, i32 signext %b) nounwind {
 ; RV64I-LABEL: ror_i32:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    neg a2, a1
+; RV64I-NEXT:    negw a2, a1
 ; RV64I-NEXT:    srlw a1, a0, a1
 ; RV64I-NEXT:    sllw a0, a0, a2
 ; RV64I-NEXT:    or a0, a1, a0
@@ -203,7 +203,7 @@ define signext i32 @ror_i32(i32 signext %a, i32 signext %b) nounwind {
 define void @ror_i32_nosext(i32 signext %a, i32 signext %b, ptr %x) nounwind {
 ; RV64I-LABEL: ror_i32_nosext:
 ; RV64I:       # %bb.0:
-; RV64I-NEXT:    neg a3, a1
+; RV64I-NEXT:    negw a3, a1
 ; RV64I-NEXT:    srlw a1, a0, a1
 ; RV64I-NEXT:    sllw a0, a0, a3
 ; RV64I-NEXT:    or a0, a1, a0
@@ -224,7 +224,7 @@ define signext i32 @ror_i32_neg_constant_rhs(i32 signext %a) nounwind {
 ; RV64I-LABEL: ror_i32_neg_constant_rhs:
 ; RV64I:       # %bb.0:
 ; RV64I-NEXT:    li a1, -2
-; RV64I-NEXT:    neg a2, a0
+; RV64I-NEXT:    negw a2, a0
 ; RV64I-NEXT:    srlw a0, a1, a0
 ; RV64I-NEXT:    sllw a1, a1, a2
 ; RV64I-NEXT:    or a0, a0, a1
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
index 9df319e73a11a3..9a6c718703a27a 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbb.ll
@@ -31,13 +31,13 @@ define signext i32 @ctlz_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -88,13 +88,13 @@ define signext i32 @log2_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -103,7 +103,7 @@ define signext i32 @log2_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    call __muldi3
 ; RV64I-NEXT:    srliw a0, a0, 24
 ; RV64I-NEXT:    li a1, 32
-; RV64I-NEXT:    sub a0, a1, a0
+; RV64I-NEXT:    subw a0, a1, a0
 ; RV64I-NEXT:    j .LBB1_3
 ; RV64I-NEXT:  .LBB1_2:
 ; RV64I-NEXT:    li a0, 32
@@ -153,13 +153,13 @@ define signext i32 @log2_ceil_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -168,7 +168,7 @@ define signext i32 @log2_ceil_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    call __muldi3
 ; RV64I-NEXT:    srliw a0, a0, 24
 ; RV64I-NEXT:    li a1, 32
-; RV64I-NEXT:    sub a1, a1, a0
+; RV64I-NEXT:    subw a1, a1, a0
 ; RV64I-NEXT:  .LBB2_2: # %cond.end
 ; RV64I-NEXT:    subw a0, s0, a1
 ; RV64I-NEXT:    ld ra, 8(sp) # 8-byte Folded Reload
@@ -212,13 +212,13 @@ define signext i32 @findLastSet_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -283,13 +283,13 @@ define i32 @ctlz_lshr_i32(i32 signext %a) {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -412,13 +412,13 @@ define signext i32 @cttz_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -455,13 +455,13 @@ define signext i32 @cttz_zero_undef_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -497,13 +497,13 @@ define signext i32 @findFirstSet_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -553,13 +553,13 @@ define signext i32 @ffs_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -672,13 +672,13 @@ define signext i32 @ctpop_i32(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -709,13 +709,13 @@ define i1 @ctpop_i32_ult_two(i32 signext %a) nounwind {
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
@@ -750,13 +750,13 @@ define signext i32 @ctpop_i32_load(ptr %p) nounwind {
 ; RV64I-NEXT:    and a1, a2, a1
 ; RV64I-NEXT:    lui a2, 209715
 ; RV64I-NEXT:    addi a2, a2, 819
-; RV64I-NEXT:    sub a0, a0, a1
+; RV64I-NEXT:    subw a0, a0, a1
 ; RV64I-NEXT:    srliw a1, a0, 2
 ; RV64I-NEXT:    and a0, a0, a2
 ; RV64I-NEXT:    and a1, a1, a2
 ; RV64I-NEXT:    lui a2, 61681
-; RV64I-NEXT:    addw a0, a1, a0
-; RV64I-NEXT:    srli a1, a0, 4
+; RV64I-NEXT:    add a0, a1, a0
+; RV64I-NEXT:    sraiw a1, a0, 4
 ; RV64I-NEXT:    addw a0, a1, a0
 ; RV64I-NEXT:    lui a1, 4112
 ; RV64I-NEXT:    addiw a2, a2, -241
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
index bf430c618afca2..558424b53be951 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/rv64zbkb.ll
@@ -16,9 +16,7 @@ define signext i32 @pack_i32(i32 signext %a, i32 signext %b) nounwind {
 ;
 ; RV64ZBKB-LABEL: pack_i32:
 ; RV64ZBKB:       # %bb.0:
-; RV64ZBKB-NEXT:    zext.h a0, a0
-; RV64ZBKB-NEXT:    slliw a1, a1, 16
-; RV64ZBKB-NEXT:    or a0, a1, a0
+; RV64ZBKB-NEXT:    packw a0, a0, a1
 ; RV64ZBKB-NEXT:    ret
   %shl = and i32 %a, 65535
   %shl1 = shl i32 %b, 16
@@ -37,9 +35,7 @@ define signext i32 @pack_i32_2(i16 zeroext %a, i16 zeroext %b) nounwind {
 ;
 ; RV64ZBKB-LABEL: pack_i32_2:
 ; RV64ZBKB:       # %bb.0:
-; RV64ZBKB-NEXT:    slli a1, a1, 16
-; RV64ZBKB-NEXT:    or a0, a1, a0
-; RV64ZBKB-NEXT:    sext.w a0, a0
+; RV64ZBKB-NEXT:    packw a0, a0, a1
 ; RV64ZBKB-NEXT:    ret
   %zexta = zext i16 %a to i32
   %zextb = zext i16 %b to i32
@@ -60,8 +56,7 @@ define signext i32 @pack_i32_3(i16 zeroext %0, i16 zeroext %1, i32 signext %2) {
 ;
 ; RV64ZBKB-LABEL: pack_i32_3:
 ; RV64ZBKB:       # %bb.0:
-; RV64ZBKB-NEXT:    slli a0, a0, 16
-; RV64ZBKB-NEXT:    or a0, a0, a1
+; RV64ZBKB-NEXT:    packw a0, a1, a0
 ; RV64ZBKB-NEXT:    addw a0, a0, a2
 ; RV64ZBKB-NEXT:    ret
   %4 = zext i16 %0 to i32
@@ -343,9 +338,7 @@ define signext i32 @pack_i32_allWUsers(i16 zeroext %0, i16 zeroext %1, i16 zeroe
 ; RV64ZBKB:       # %bb.0:
 ; RV64ZBKB-NEXT:    add a0, a1, a0
 ; RV64ZBKB-NEXT:    zext.h a0, a0
-; RV64ZBKB-NEXT:    slli a0, a0, 16
-; RV64ZBKB-NEXT:    or a0, a0, a2
-; RV64ZBKB-NEXT:    sext.w a0, a0
+; RV64ZBKB-NEXT:    packw a0, a2, a0
 ; RV64ZBKB-NEXT:    ret
   %4 = add i16 %1, %0
   %5 = zext i16 %4 to i32
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll b/llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll
index caa749729ce198..bb53a7f1422411 100644
--- a/llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/stacksave-stackrestore.ll
@@ -48,7 +48,7 @@ define void @test_scoped_alloca(i64 %n) {
 ; RV64-NEXT:    .cfi_offset s1, -24
 ; RV64-NEXT:  ...
[truncated]

@lquinn2015
Copy link
Contributor Author

I wanted 2 points of feed back here

  1. there is no Recursion limit available in GISel like there is in SelectionDag but I was unsure of where to put that in the CodeGen/GlobalIsel headers

  2. I am unsure how to check from a MachineOperand if it is a vector operation in GlobalISel I think I am missing a operation. Reguardless of doing that check the following code should default to failure for selected Vector type instructions since there opcodes are not enumerated. I think its better to stage out the Vector pseudo part as well.

Copy link

github-actions bot commented Jan 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

break;
return false;
case RISCV::SW:
if (Use.getNumOperands() == 0 && Bits >= 32)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have test coverage for all of these instructions? I don't see any sw/sh/sb related diff below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not a enough testing from the Global Isel group. Some of the instructions are not yet legalizable i was trying to be faithful to the SelectionDag but maybe we need to be less ambitious however a decent chuck are actually checked. I am compiling a list i'll try and have the list later today that seem tested its difficult though because I think some of the tests were partially capture by the renderImm work Craig and I did so its less clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up going through the tests and am trying to come up with a better testing plan. In ISel we had 27 different .ll files of varying complex to test this i think it might actually take me a bit to generate a new test cases for every case statement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My latest push reduce the coverage of these operations to a minimal set it really is reduced but tested now

@lquinn2015 lquinn2015 force-pushed the dev/lquinn/binop_wusers branch from 3c4e266 to 0e97975 Compare January 29, 2025 14:37
…ypes for staging adding full support

Signed-off-by: Luke Quinn <[email protected]>
…generation of packw instructions along with other generic instructions with narrow w type.

Signed-off-by: Luke Quinn <[email protected]>
@lquinn2015 lquinn2015 force-pushed the dev/lquinn/binop_wusers branch from 0e97975 to 9cdd863 Compare February 3, 2025 14:01
@lquinn2015 lquinn2015 closed this Feb 5, 2025
@lenary
Copy link
Member

lenary commented Feb 5, 2025

Why was this closed?

@topperc
Copy link
Collaborator

topperc commented Feb 5, 2025

Why was this closed?

github wasn't refreshing the page to show the correct version of the patch. There was a status that said "Processing Update" for over day. A similar thing happened with several of my patches yesterday. There was a conversation in #backends on discord that suggested opening a new PR and linking to this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants