[GISEL][RISCV] RegBank Scalable Vector Load/Store #99932

Merged: 8 commits into llvm:main, Jul 31, 2024

Conversation

jiahanxie353 (Contributor)

This patch adds GlobalISel register bank selection support for scalable vector load and store instructions.
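
In rough terms, the new G_LOAD mapping looks like this (an excerpt from the diff attached below; G_STORE is handled identically): scalable vector results go to the vector register bank, sized by their known-minimum width, while the pointer operand stays on GPR.

  case TargetOpcode::G_LOAD: {
    LLT Ty = MRI.getType(MI.getOperand(0).getReg());
    TypeSize Size = Ty.getSizeInBits();
    if (Ty.isVector())
      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
    else
      OpdsMapping[0] = GPRValueMapping;

    OpdsMapping[1] = GPRValueMapping; // the pointer operand is always a scalar address
    // Use FPR64 for s64 loads on rv32 (scalar-only, hence the !isVector check).
    if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {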

     // Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (GPRSize == 32 && Ty.getSizeInBits().getKnownMinValue() == 64 &&
+        !Ty.isVector()) {
jiahanxie353 (Contributor, Author):

I'm not confident about this if statement.

Collaborator:

Check !Ty.isVector() before Ty.getSizeInBits() and drop the getKnownMinValue(). Scalar types should never be scalable.
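
A minimal sketch of that suggestion, using the same names as the snippet above: with the scalar check first, the size can be compared directly, because a scalar LLT is never scalable.

    // Use FPR64 for s64 loads on rv32.
    if (!Ty.isVector() && GPRSize == 32 && Ty.getSizeInBits() == 64) {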


; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
; RV64I-NEXT: PseudoRET implicit $v8
%0:gprb(p0) = COPY $x10
%2:vrb(p0) = COPY %0(p0)
jiahanxie353 (Contributor, Author):

I wonder why p0 is copied to vrb. Although p0 points to a scalable vector, which should live in vrb, p0 itself is not scalable. Just wondering.

Contributor:

  case TargetOpcode::G_LOAD: {
    LLT Ty = MRI.getType(MI.getOperand(0).getReg());
    TypeSize Size = Ty.getSizeInBits();
    if (Ty.isVector()) {
      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
      OpdsMapping[1] = getVRBValueMapping(Size.getKnownMinValue());

You are mapping the pointer operand of the G_LOAD to the vector bank, which causes the copy that feeds it to be treated as a vector-bank value, since we do the analysis bottom-up.
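
For reference, this is the copy-mapping code in question (the RegisterBankInfo.cpp hunk from this patch): roughly speaking, a copy-like instruction derives its value mapping from a register bank that has already been assigned elsewhere (CurRegBank), so once the G_LOAD's pointer operand is mapped to the vector bank, the copy feeding it follows.

    TypeSize Size = getSizeInBits(Reg, MRI, TRI);
    const ValueMapping *ValMapping =
        &getValueMapping(0, Size.getKnownMinValue(), *CurRegBank);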

jiahanxie353 (Contributor, Author):

Makes sense, thanks @michaelmaitland and @topperc!

     // Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getKnownMinValue() == 64) {
Collaborator:

Use getFixedValue() instead of getKnownMinValue(). It should not be scalable since we already checked !Ty.isVector().
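
As committed (see the full diff below), the check reads:

    if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {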

     // Use FPR64 for s64 stores on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getKnownMinValue() == 64) {
Collaborator:

Use getFixedValue()

    TypeSize Size = Ty.getSizeInBits();
    if (Ty.isVector()) {
      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
      OpdsMapping[1] = getVRBValueMapping(Size.getKnownMinValue());
Collaborator:

OpdsMapping[1] should always be GPRValueMapping.


     OpdsMapping[1] = GPRValueMapping;
     // Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getFixedValue() == 64) {
Contributor:

Use Size.getFixedValue(); Size has already been computed above.

     OpdsMapping[1] = GPRValueMapping;
     // Use FPR64 for s64 stores on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getFixedValue() == 64) {
Contributor:

Use Size.getFixedValue() here as well, matching the load side.

michaelmaitland (Contributor) left a comment:

LGTM

llvmbot (Member) commented on Jul 30, 2024:

@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-regalloc

Author: Jiahan Xie (jiahanxie353)

Changes

This patch adds GlobalISel register bank selection support for scalable vector load and store instructions.


Patch is 137.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99932.diff

4 Files Affected:

  • (modified) llvm/lib/CodeGen/RegisterBankInfo.cpp (+3-2)
  • (modified) llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp (+14-4)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir (+1698)
  • (added) llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/store.mir (+1698)
diff --git a/llvm/lib/CodeGen/RegisterBankInfo.cpp b/llvm/lib/CodeGen/RegisterBankInfo.cpp
index 72b07eb1902d9..00dcc1fbcd0c7 100644
--- a/llvm/lib/CodeGen/RegisterBankInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterBankInfo.cpp
@@ -215,8 +215,9 @@ RegisterBankInfo::getInstrMappingImpl(const MachineInstr &MI) const {
       }
     }
 
-    unsigned Size = getSizeInBits(Reg, MRI, TRI);
-    const ValueMapping *ValMapping = &getValueMapping(0, Size, *CurRegBank);
+    TypeSize Size = getSizeInBits(Reg, MRI, TRI);
+    const ValueMapping *ValMapping =
+        &getValueMapping(0, Size.getKnownMinValue(), *CurRegBank);
     if (IsCopyLike) {
       if (!OperandsMapping[0]) {
         if (MI.isRegSequence()) {
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
index 43bbc8589e7e2..f7279bbbd6488 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
@@ -310,10 +310,15 @@ RISCVRegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
   switch (Opc) {
   case TargetOpcode::G_LOAD: {
     LLT Ty = MRI.getType(MI.getOperand(0).getReg());
-    OpdsMapping[0] = GPRValueMapping;
+    TypeSize Size = Ty.getSizeInBits();
+    if (Ty.isVector())
+      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
+    else
+      OpdsMapping[0] = GPRValueMapping;
+
     OpdsMapping[1] = GPRValueMapping;
     // Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {
       assert(MF.getSubtarget<RISCVSubtarget>().hasStdExtD());
       OpdsMapping[0] = getFPValueMapping(Ty.getSizeInBits());
       break;
@@ -333,10 +338,15 @@ RISCVRegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
   }
   case TargetOpcode::G_STORE: {
     LLT Ty = MRI.getType(MI.getOperand(0).getReg());
-    OpdsMapping[0] = GPRValueMapping;
+    TypeSize Size = Ty.getSizeInBits();
+    if (Ty.isVector())
+      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
+    else
+      OpdsMapping[0] = GPRValueMapping;
+
     OpdsMapping[1] = GPRValueMapping;
     // Use FPR64 for s64 stores on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {
       assert(MF.getSubtarget<RISCVSubtarget>().hasStdExtD());
       OpdsMapping[0] = getFPValueMapping(Ty.getSizeInBits());
       break;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir b/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir
new file mode 100644
index 0000000000000..73ac2702cf9d4
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir
@@ -0,0 +1,1698 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=riscv32 -mattr=+m,+v -run-pass=regbankselect \
+# RUN:   -disable-gisel-legality-check -simplify-mir -verify-machineinstrs %s \
+# RUN:   -o - | FileCheck -check-prefix=RV32I %s
+# RUN: llc -mtriple=riscv64 -mattr=+m,+v -run-pass=regbankselect \
+# RUN:   -disable-gisel-legality-check -simplify-mir -verify-machineinstrs %s \
+# RUN:   -o - | FileCheck -check-prefix=RV64I %s
+--- |
+
+  define <vscale x 1 x i8> @vload_nx1i8(ptr %pa) #0 {
+    %va = load <vscale x 1 x i8>, ptr %pa, align 1
+    ret <vscale x 1 x i8> %va
+  }
+
+  define <vscale x 2 x i8> @vload_nx2i8(ptr %pa) #0 {
+    %va = load <vscale x 2 x i8>, ptr %pa, align 2
+    ret <vscale x 2 x i8> %va
+  }
+
+  define <vscale x 4 x i8> @vload_nx4i8(ptr %pa) #0 {
+    %va = load <vscale x 4 x i8>, ptr %pa, align 4
+    ret <vscale x 4 x i8> %va
+  }
+
+  define <vscale x 8 x i8> @vload_nx8i8(ptr %pa) #0 {
+    %va = load <vscale x 8 x i8>, ptr %pa, align 8
+    ret <vscale x 8 x i8> %va
+  }
+
+  define <vscale x 16 x i8> @vload_nx16i8(ptr %pa) #0 {
+    %va = load <vscale x 16 x i8>, ptr %pa, align 16
+    ret <vscale x 16 x i8> %va
+  }
+
+  define <vscale x 32 x i8> @vload_nx32i8(ptr %pa) #0 {
+    %va = load <vscale x 32 x i8>, ptr %pa, align 32
+    ret <vscale x 32 x i8> %va
+  }
+
+  define <vscale x 64 x i8> @vload_nx64i8(ptr %pa) #0 {
+    %va = load <vscale x 64 x i8>, ptr %pa, align 64
+    ret <vscale x 64 x i8> %va
+  }
+
+  define <vscale x 1 x i16> @vload_nx1i16(ptr %pa) #0 {
+    %va = load <vscale x 1 x i16>, ptr %pa, align 2
+    ret <vscale x 1 x i16> %va
+  }
+
+  define <vscale x 2 x i16> @vload_nx2i16(ptr %pa) #0 {
+    %va = load <vscale x 2 x i16>, ptr %pa, align 4
+    ret <vscale x 2 x i16> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 8
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 8 x i16> @vload_nx8i16(ptr %pa) #0 {
+    %va = load <vscale x 8 x i16>, ptr %pa, align 16
+    ret <vscale x 8 x i16> %va
+  }
+
+  define <vscale x 16 x i16> @vload_nx16i16(ptr %pa) #0 {
+    %va = load <vscale x 16 x i16>, ptr %pa, align 32
+    ret <vscale x 16 x i16> %va
+  }
+
+  define <vscale x 32 x i16> @vload_nx32i16(ptr %pa) #0 {
+    %va = load <vscale x 32 x i16>, ptr %pa, align 64
+    ret <vscale x 32 x i16> %va
+  }
+
+  define <vscale x 1 x i32> @vload_nx1i32(ptr %pa) #0 {
+    %va = load <vscale x 1 x i32>, ptr %pa, align 4
+    ret <vscale x 1 x i32> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 8
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 4 x i32> @vload_nx4i32(ptr %pa) #0 {
+    %va = load <vscale x 4 x i32>, ptr %pa, align 16
+    ret <vscale x 4 x i32> %va
+  }
+
+  define <vscale x 8 x i32> @vload_nx8i32(ptr %pa) #0 {
+    %va = load <vscale x 8 x i32>, ptr %pa, align 32
+    ret <vscale x 8 x i32> %va
+  }
+
+  define <vscale x 16 x i32> @vload_nx16i32(ptr %pa) #0 {
+    %va = load <vscale x 16 x i32>, ptr %pa, align 64
+    ret <vscale x 16 x i32> %va
+  }
+
+  define <vscale x 1 x i64> @vload_nx1i64(ptr %pa) #0 {
+    %va = load <vscale x 1 x i64>, ptr %pa, align 8
+    ret <vscale x 1 x i64> %va
+  }
+
+  define <vscale x 2 x i64> @vload_nx2i64(ptr %pa) #0 {
+    %va = load <vscale x 2 x i64>, ptr %pa, align 16
+    ret <vscale x 2 x i64> %va
+  }
+
+  define <vscale x 4 x i64> @vload_nx4i64(ptr %pa) #0 {
+    %va = load <vscale x 4 x i64>, ptr %pa, align 32
+    ret <vscale x 4 x i64> %va
+  }
+
+  define <vscale x 8 x i64> @vload_nx8i64(ptr %pa) #0 {
+    %va = load <vscale x 8 x i64>, ptr %pa, align 64
+    ret <vscale x 8 x i64> %va
+  }
+
+  define <vscale x 16 x i8> @vload_nx16i8_align1(ptr %pa) #0 {
+    %va = load <vscale x 16 x i8>, ptr %pa, align 1
+    ret <vscale x 16 x i8> %va
+  }
+
+  define <vscale x 16 x i8> @vload_nx16i8_align2(ptr %pa) #0 {
+    %va = load <vscale x 16 x i8>, ptr %pa, align 2
+    ret <vscale x 16 x i8> %va
+  }
+
+  define <vscale x 16 x i8> @vload_nx16i8_align16(ptr %pa) #0 {
+    %va = load <vscale x 16 x i8>, ptr %pa, align 16
+    ret <vscale x 16 x i8> %va
+  }
+
+  define <vscale x 16 x i8> @vload_nx16i8_align64(ptr %pa) #0 {
+    %va = load <vscale x 16 x i8>, ptr %pa, align 64
+    ret <vscale x 16 x i8> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16_align1(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 1
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16_align2(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 2
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16_align4(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 4
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16_align8(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 8
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 4 x i16> @vload_nx4i16_align16(ptr %pa) #0 {
+    %va = load <vscale x 4 x i16>, ptr %pa, align 16
+    ret <vscale x 4 x i16> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32_align2(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 2
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32_align4(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 4
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32_align8(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 8
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32_align16(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 16
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 2 x i32> @vload_nx2i32_align256(ptr %pa) #0 {
+    %va = load <vscale x 2 x i32>, ptr %pa, align 256
+    ret <vscale x 2 x i32> %va
+  }
+
+  define <vscale x 2 x i64> @vload_nx2i64_align4(ptr %pa) #0 {
+    %va = load <vscale x 2 x i64>, ptr %pa, align 4
+    ret <vscale x 2 x i64> %va
+  }
+
+  define <vscale x 2 x i64> @vload_nx2i64_align8(ptr %pa) #0 {
+    %va = load <vscale x 2 x i64>, ptr %pa, align 8
+    ret <vscale x 2 x i64> %va
+  }
+
+  define <vscale x 2 x i64> @vload_nx2i64_align16(ptr %pa) #0 {
+    %va = load <vscale x 2 x i64>, ptr %pa, align 16
+    ret <vscale x 2 x i64> %va
+  }
+
+  define <vscale x 2 x i64> @vload_nx2i64_align32(ptr %pa) #0 {
+    %va = load <vscale x 2 x i64>, ptr %pa, align 32
+    ret <vscale x 2 x i64> %va
+  }
+
+  define <vscale x 1 x ptr> @vload_nx1ptr(ptr %pa) #0 {
+    %va = load <vscale x 1 x ptr>, ptr %pa, align 4
+    ret <vscale x 1 x ptr> %va
+  }
+
+  define <vscale x 2 x ptr> @vload_nx2ptr(ptr %pa) #0 {
+    %va = load <vscale x 2 x ptr>, ptr %pa, align 8
+    ret <vscale x 2 x ptr> %va
+  }
+
+  define <vscale x 8 x ptr> @vload_nx8ptr(ptr %pa) #0 {
+    %va = load <vscale x 8 x ptr>, ptr %pa, align 32
+    ret <vscale x 8 x ptr> %va
+  }
+
+  attributes #0 = { "target-features"="+v" }
+
+...
+---
+name:            vload_nx1i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx1i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8
+    ;
+    ; RV64I-LABEL: name: vload_nx1i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 1 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+    $v8 = COPY %1(<vscale x 1 x s8>)
+    PseudoRET implicit $v8
+
+...
+---
+name:            vload_nx2i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx2i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 2 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8
+    ;
+    ; RV64I-LABEL: name: vload_nx2i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 2 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 2 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+    $v8 = COPY %1(<vscale x 2 x s8>)
+    PseudoRET implicit $v8
+
+...
+---
+name:            vload_nx4i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx4i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 4 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 4 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8
+    ;
+    ; RV64I-LABEL: name: vload_nx4i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 4 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 4 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 4 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+    $v8 = COPY %1(<vscale x 4 x s8>)
+    PseudoRET implicit $v8
+
+...
+---
+name:            vload_nx8i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx8i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 8 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 8 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8
+    ;
+    ; RV64I-LABEL: name: vload_nx8i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 8 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 8 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 8 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+    $v8 = COPY %1(<vscale x 8 x s8>)
+    PseudoRET implicit $v8
+
+...
+---
+name:            vload_nx16i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx16i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 16 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8m2 = COPY [[LOAD]](<vscale x 16 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8m2
+    ;
+    ; RV64I-LABEL: name: vload_nx16i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 16 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8m2 = COPY [[LOAD]](<vscale x 16 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8m2
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 16 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+    $v8m2 = COPY %1(<vscale x 16 x s8>)
+    PseudoRET implicit $v8m2
+
+...
+---
+name:            vload_nx32i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx32i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 32 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8m4 = COPY [[LOAD]](<vscale x 32 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8m4
+    ;
+    ; RV64I-LABEL: name: vload_nx32i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 32 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8m4 = COPY [[LOAD]](<vscale x 32 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8m4
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 32 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+    $v8m4 = COPY %1(<vscale x 32 x s8>)
+    PseudoRET implicit $v8m4
+
+...
+---
+name:            vload_nx64i8
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx64i8
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 64 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+    ; RV32I-NEXT: $v8m8 = COPY [[LOAD]](<vscale x 64 x s8>)
+    ; RV32I-NEXT: PseudoRET implicit $v8m8
+    ;
+    ; RV64I-LABEL: name: vload_nx64i8
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 64 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+    ; RV64I-NEXT: $v8m8 = COPY [[LOAD]](<vscale x 64 x s8>)
+    ; RV64I-NEXT: PseudoRET implicit $v8m8
+    %0:gprb(p0) = COPY $x10
+    %2:vrb(p0) = COPY %0(p0)
+    %1:vrb(<vscale x 64 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+    $v8m8 = COPY %1(<vscale x 64 x s8>)
+    PseudoRET implicit $v8m8
+
+...
+---
+name:            vload_nx1i16
+legalized:       true
+tracksRegLiveness: true
+body:             |
+  bb.1 (%ir-block.0):
+    liveins: $x10
+
+    ; RV32I-LABEL: name: vload_nx1i16
+    ; RV32I: liveins: $x10
+    ; RV32I-NEXT: {{  $}}
+    ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+    ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+    ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+    ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s16>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s16>) from %ir.pa)
+    ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s16>)
+    ; RV32I-NEXT: PseudoRET implicit $v8
+    ;
+    ; RV64I-LABEL: name: vload_nx1i16
+    ; RV64I: liveins: $x10
+    ; RV64I-NEXT: {{  $}}
+    ; RV64I-NEX...
[truncated]

jiahanxie353 (Contributor, Author):

The tests are not passing only because they fail in the MIParser; they will pass once the corresponding change in the legalizer patch is merged.

topperc (Collaborator) left a comment:

LGTM

jiahanxie353 merged commit 1c66ef9 into llvm:main on Jul 31, 2024.
4 of 6 checks passed