[RISCV] Add basic ISel patterns for Xqcisls instructions #135918

Merged (2 commits) on Apr 16, 2025

svs-quic
Contributor

This patch adds basic instruction selection (ISel) patterns for generating the scaled load/store instructions that are part of the Qualcomm uC Xqcisls vendor extension.
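
For readers less familiar with the addressing form: the `AddShl` fragment in the diff matches an address of the shape `rs1 + (rs2 << shamt)`, where `shamt` is a `uimm3` operand. The following is a minimal Python sketch of that arithmetic only, not of the instruction encoding or of anything in LLVM. It assumes 32-bit wraparound (the patterns are guarded by `IsRV32`), and the helper name `scaled_address` is hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumption: model 32-bit registers, since the patterns require IsRV32


def scaled_address(rs1: int, rs2: int, shamt: int) -> int:
    """Model the address shape matched by AddShl: rs1 + (rs2 << shamt)."""
    if not 0 <= shamt <= 7:
        # uimm3 is a 3-bit unsigned immediate, so only 0..7 can be encoded
        raise ValueError("shamt must fit in 3 unsigned bits")
    shifted = (rs2 << shamt) & MASK32
    return (rs1 + shifted) & MASK32


# Example: base 0x1000, index 5, shift 3 (index scaled by 8)
print(hex(scaled_address(0x1000, 5, 3)))  # 0x1028
```

With the patterns in place, a load or store whose address has this shape can be selected as a single `qc.lr*` / `qc.sr*` instruction instead of a separate shift, add, and memory access, which is what the `RV32IZBAXQCISLS` check lines in the new test show.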

@llvmbot
Member

llvmbot commented Apr 16, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Sudharsan Veeravalli (svs-quic)

Changes

This patch adds basic instruction selection patterns for generating the scaled load/store instructions that are a part of the Qualcomm uC Xqcisls vendor extension.


Full diff: https://github.com/llvm/llvm-project/pull/135918.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.td (+1-1)
  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td (+28)
  • (added) llvm/test/CodeGen/RISCV/xqcisls.ll (+207)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.td b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
index 1104d9089536f..baf2bae367df1 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.td
@@ -248,7 +248,7 @@ def InsnDirectiveOpcode : AsmOperandClass {
 
 def uimm1 : RISCVUImmLeafOp<1>;
 def uimm2 : RISCVUImmLeafOp<2>;
-def uimm3 : RISCVUImmOp<3>;
+def uimm3 : RISCVUImmLeafOp<3>;
 def uimm4 : RISCVUImmLeafOp<4>;
 def uimm5 : RISCVUImmLeafOp<5>;
 def uimm6 : RISCVUImmLeafOp<6>;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
index 6736b0f1d0328..2e5e16e781f50 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
@@ -164,6 +164,9 @@ def AddLike: PatFrags<(ops node:$A, node:$B),
     return CurDAG->isBaseWithConstantOffset(SDValue(N, 0));
 }]>;
 
+def AddShl : PatFrag<(ops node:$Ra, node:$Rb, node:$SH3),
+                     (add node:$Ra, (shl node:$Rb, node:$SH3))>;
+
 //===----------------------------------------------------------------------===//
 // Instruction Formats
 //===----------------------------------------------------------------------===//
@@ -1252,6 +1255,14 @@ class QC48StPat<PatFrag StoreOp, RVInst48 Inst>
     : Pat<(StoreOp (i32 GPR:$rs2), (AddLike (i32 GPR:$rs1), simm26_nosimm12:$imm26)),
           (Inst GPR:$rs2, GPR:$rs1, simm26_nosimm12:$imm26)>;
 
+class QCScaledLdPat<PatFrag LoadOp, RVInst Inst>
+    : Pat<(i32 (LoadOp (AddShl (i32 GPRMem:$rs1), (i32 GPRNoX0:$rs2), uimm3:$shamt))),
+          (Inst GPRMem:$rs1, GPRNoX0:$rs2, uimm3:$shamt)>;
+
+class QCScaledStPat<PatFrag StoreOp, RVInst Inst>
+    : Pat<(StoreOp (i32 GPR:$rd),(AddShl (i32 GPRMem:$rs1), (i32 GPRNoX0:$rs2), uimm3:$shamt)),
+          (Inst GPR:$rd, GPRMem:$rs1, GPRNoX0:$rs2, uimm3:$shamt)>;
+
 /// Simple arithmetic operations
 
 let Predicates = [HasVendorXqcilia, IsRV32] in {
@@ -1266,6 +1277,8 @@ def : PatGprNoX0Simm26NoSimm12<or, QC_E_ORI>;
 def : PatGprNoX0Simm26NoSimm12<xor, QC_E_XORI>;
 } // Predicates = [HasVendorXqcilia, IsRV32]
 
+/// Load/Store operations
+
 let Predicates = [HasVendorXqcilo, IsRV32], AddedComplexity = 2 in {
   def : QC48LdPat<sextloadi8, QC_E_LB>;
   def : QC48LdPat<extloadi8, QC_E_LBU>; // Prefer unsigned due to no c.lb in Zcb.
@@ -1280,5 +1293,20 @@ let Predicates = [HasVendorXqcilo, IsRV32], AddedComplexity = 2 in {
   def : QC48StPat<store, QC_E_SW>;
 } // Predicates = [HasVendorXqcilo, IsRV32], AddedComplexity = 2
 
+
+let Predicates = [HasVendorXqcisls, IsRV32], AddedComplexity = 1 in {
+  def : QCScaledLdPat<sextloadi8, QC_LRB>;
+  def : QCScaledLdPat<extloadi8, QC_LRBU>;
+  def : QCScaledLdPat<sextloadi16, QC_LRH>;
+  def : QCScaledLdPat<extloadi16, QC_LRH>;
+  def : QCScaledLdPat<load, QC_LRW>;
+  def : QCScaledLdPat<zextloadi8, QC_LRBU>;
+  def : QCScaledLdPat<zextloadi16, QC_LRHU>;
+
+  def : QCScaledStPat<truncstorei8, QC_SRB>;
+  def : QCScaledStPat<truncstorei16, QC_SRH>;
+  def : QCScaledStPat<store, QC_SRW>;
+} // Predicates = [HasVendorXqcisls, IsRV32], AddedComplexity = 1
+
 let Predicates = [HasVendorXqciint, IsRV32] in
 def : Pat<(riscv_mileaveret_glue), (QC_C_MILEAVERET)>;
diff --git a/llvm/test/CodeGen/RISCV/xqcisls.ll b/llvm/test/CodeGen/RISCV/xqcisls.ll
new file mode 100644
index 0000000000000..b9263d487b60f
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/xqcisls.ll
@@ -0,0 +1,207 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV32I
+; RUN: llc -mtriple=riscv32 --mattr=+zba -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV32IZBA
+; RUN: llc -mtriple=riscv32 -mattr=+zba,+experimental-xqcisls -verify-machineinstrs < %s \
+; RUN:   | FileCheck %s -check-prefixes=RV32IZBAXQCISLS
+
+define i32 @lb_ri(i8* %a, i32 %b) {
+; RV32I-LABEL: lb_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a1, 3
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    lb a0, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: lb_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    sh3add a0, a1, a0
+; RV32IZBA-NEXT:    lb a0, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: lb_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.lrb a0, a0, a1, 3
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %b, 3
+  %1 = getelementptr i8, i8* %a, i32 %shl
+  %2 = load i8, i8* %1
+  %3 = sext i8 %2 to i32
+  ret i32 %3
+}
+
+define i32 @lbu_ri(i8* %a, i32 %b) {
+; RV32I-LABEL: lbu_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a1, 2
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    lbu a0, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: lbu_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    sh2add a0, a1, a0
+; RV32IZBA-NEXT:    lbu a0, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: lbu_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.lrbu a0, a0, a1, 2
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %b, 2
+  %1 = getelementptr i8, i8* %a, i32 %shl
+  %2 = load i8, i8* %1
+  %3 = zext i8 %2 to i32
+  ret i32 %3
+}
+
+define i32 @lh_ri(i16* %a, i32 %b) {
+; RV32I-LABEL: lh_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a1, 5
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    lh a0, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: lh_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    slli a1, a1, 5
+; RV32IZBA-NEXT:    add a0, a0, a1
+; RV32IZBA-NEXT:    lh a0, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: lh_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.lrh a0, a0, a1, 5
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %b, 4
+  %1 = getelementptr i16, i16* %a, i32 %shl
+  %2 = load i16, i16* %1
+  %3 = sext i16 %2 to i32
+  ret i32 %3
+}
+
+define i32 @lhu_ri(i16* %a, i32 %b) {
+; RV32I-LABEL: lhu_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a1, 6
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    lhu a0, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: lhu_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    slli a1, a1, 6
+; RV32IZBA-NEXT:    add a0, a0, a1
+; RV32IZBA-NEXT:    lhu a0, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: lhu_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.lrhu a0, a0, a1, 6
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %b, 5
+  %1 = getelementptr i16, i16* %a, i32 %shl
+  %2 = load i16, i16* %1
+  %3 = zext i16 %2 to i32
+  ret i32 %3
+}
+
+define i32 @lw_ri(i32* %a, i32 %b) {
+; RV32I-LABEL: lw_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a1, a1, 6
+; RV32I-NEXT:    add a0, a0, a1
+; RV32I-NEXT:    lw a0, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: lw_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    slli a1, a1, 6
+; RV32IZBA-NEXT:    add a0, a0, a1
+; RV32IZBA-NEXT:    lw a0, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: lw_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.lrw a0, a0, a1, 6
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %b, 4
+  %1 = getelementptr i32, i32* %a, i32 %shl
+  %2 = load i32, i32* %1
+  ret i32 %2
+}
+
+define void @sb_ri(i8* %a, i8 %b, i32 %c) {
+; RV32I-LABEL: sb_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a2, a2, 7
+; RV32I-NEXT:    add a0, a0, a2
+; RV32I-NEXT:    sb a1, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: sb_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    slli a2, a2, 7
+; RV32IZBA-NEXT:    add a0, a0, a2
+; RV32IZBA-NEXT:    sb a1, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: sb_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.srb a1, a0, a2, 7
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %c, 7
+  %1 = getelementptr i8, i8* %a, i32 %shl
+  store i8 %b, i8* %1
+  ret void
+}
+
+define void @sh_ri(i16* %a, i16 %b, i32 %c) {
+; RV32I-LABEL: sh_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a2, a2, 3
+; RV32I-NEXT:    add a0, a0, a2
+; RV32I-NEXT:    sh a1, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: sh_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    sh3add a0, a2, a0
+; RV32IZBA-NEXT:    sh a1, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: sh_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.srh a1, a0, a2, 3
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %c, 2
+  %1 = getelementptr i16, i16* %a, i32 %shl
+  store i16 %b, i16* %1
+  ret void
+}
+
+define void @sw_ri(i32* %a, i32 %b, i32 %c) {
+; RV32I-LABEL: sw_ri:
+; RV32I:       # %bb.0:
+; RV32I-NEXT:    slli a2, a2, 3
+; RV32I-NEXT:    add a0, a0, a2
+; RV32I-NEXT:    sw a1, 0(a0)
+; RV32I-NEXT:    ret
+;
+; RV32IZBA-LABEL: sw_ri:
+; RV32IZBA:       # %bb.0:
+; RV32IZBA-NEXT:    sh3add a0, a2, a0
+; RV32IZBA-NEXT:    sw a1, 0(a0)
+; RV32IZBA-NEXT:    ret
+;
+; RV32IZBAXQCISLS-LABEL: sw_ri:
+; RV32IZBAXQCISLS:       # %bb.0:
+; RV32IZBAXQCISLS-NEXT:    qc.srw a1, a0, a2, 3
+; RV32IZBAXQCISLS-NEXT:    ret
+  %shl = shl i32 %c, 1
+  %1 = getelementptr i32, i32* %a, i32 %shl
+  store i32 %b, i32* %1
+  ret void
+}
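
A note for anyone reading the test expectations: the shift amount in the emitted assembly is not always the `shl` amount written in the IR, because the `getelementptr` also scales the index by the element size. For `lh_ri`, the IR shifts by 4 and indexes `i16`, so the combined shift is 5. The sketch below recomputes the combined shift for each test function and checks which ones fall in the Zba `sh1add`/`sh2add`/`sh3add` range (shifts 1 to 3) and which fit a 3-bit unsigned immediate. The helper names are hypothetical and are not LLVM APIs. The table of cases is copied from `xqcisls.ll` above.

```python
def emitted_shift(ir_shl: int, elem_bits: int) -> int:
    """Combine the IR shl amount with the GEP element-size scaling."""
    elem_bytes = elem_bits // 8
    # Element sizes here are powers of two, so scaling by the size
    # is a further left shift by log2(elem_bytes).
    return ir_shl + (elem_bytes.bit_length() - 1)


def zba_shadd_applies(shift: int) -> bool:
    """Zba provides sh1add, sh2add and sh3add only."""
    return 1 <= shift <= 3


def fits_uimm3(shift: int) -> bool:
    """A 3-bit unsigned immediate encodes 0..7."""
    return 0 <= shift <= 7


# (test name, shl amount in the IR, element width in bits), from xqcisls.ll
CASES = [
    ("lb_ri", 3, 8), ("lbu_ri", 2, 8), ("lh_ri", 4, 16), ("lhu_ri", 5, 16),
    ("lw_ri", 4, 32), ("sb_ri", 7, 8), ("sh_ri", 2, 16), ("sw_ri", 1, 32),
]

for name, ir_shl, elem_bits in CASES:
    shift = emitted_shift(ir_shl, elem_bits)
    print(f"{name}: shift={shift} "
          f"zba_shadd={zba_shadd_applies(shift)} uimm3={fits_uimm3(shift)}")
```

This lines up with the check lines: the Zba-only run uses `sh2add`/`sh3add` for `lb_ri`, `lbu_ri`, `sh_ri` and `sw_ri` and falls back to `slli` + `add` for the rest, while the Xqcisls run folds every case into a single scaled load or store.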

Collaborator

topperc left a comment

LGTM

@svs-quic
Contributor Author

Failures are unrelated. Merging this.

svs-quic merged commit 11857be into llvm:main on Apr 16, 2025
9 of 11 checks passed
svs-quic deleted the xqcislspat branch on April 16, 2025 at 07:41
var-const pushed a commit to ldionne/llvm-project that referenced this pull request on Apr 17, 2025:
This patch adds basic instruction selection patterns for generating the
scaled load/store instructions that are a part of the Qualcomm uC
Xqcisls vendor extension.