[GISEL][RISCV] RegBank Scalable Vector Load/Store #99932
Conversation
Force-pushed from 460c670 to 28c2cc5.
// Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (GPRSize == 32 && Ty.getSizeInBits().getKnownMinValue() == 64 &&
+        !Ty.isVector()) {
not confident about this if statement
Check !Ty.isVector() before Ty.getSizeInBits() and drop the getKnownMinValue(). Scalar types should never be scalable.
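For illustration, a sketch of roughly what the reordered check could look like inside RISCVRegisterBankInfo::getInstrMapping (assuming the surrounding GPRSize, OpdsMapping, and MF variables from that function; the final form in the patch differs slightly):

  // Sketch only: vectors are handled earlier, so checking !Ty.isVector()
  // first guarantees the size below is not scalable and no
  // getKnownMinValue() call is needed.
  if (!Ty.isVector() && GPRSize == 32 && Ty.getSizeInBits() == 64) {
    // Use FPR64 for s64 loads on rv32.
    assert(MF.getSubtarget<RISCVSubtarget>().hasStdExtD());
    OpdsMapping[0] = getFPValueMapping(Ty.getSizeInBits());
    break;
  }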
Force-pushed from 28c2cc5 to 9934fad.
; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
; RV64I-NEXT: PseudoRET implicit $v8
%0:gprb(p0) = COPY $x10
%2:vrb(p0) = COPY %0(p0)
I wonder why p0 is copied to vrb. Although p0 is a pointer to a scalable vector (which should be in vrb), p0 itself is not scalable. Just wondering.
case TargetOpcode::G_LOAD: {
  LLT Ty = MRI.getType(MI.getOperand(0).getReg());
  TypeSize Size = Ty.getSizeInBits();
  if (Ty.isVector()) {
    OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
    OpdsMapping[1] = getVRBValueMapping(Size.getKnownMinValue());
You are setting the pointer operand of the G_LOAD to the vector bank, which causes the copy that feeds it to be treated as a vector value as well, since we do the analysis bottom-up.
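A simplified sketch of the fix this points at (the loaded value keeps the vector-bank mapping, but the pointer operand is a scalar p0 and stays in the GPR bank; the rv32 s64-to-FPR64 special case is omitted here):

  case TargetOpcode::G_LOAD: {
    LLT Ty = MRI.getType(MI.getOperand(0).getReg());
    TypeSize Size = Ty.getSizeInBits();
    // Only the loaded value can be a (scalable) vector.
    if (Ty.isVector())
      OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
    else
      OpdsMapping[0] = GPRValueMapping;
    // The address is always a plain pointer, so it lives in a GPR.
    OpdsMapping[1] = GPRValueMapping;
    break;
  }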
makes sense, thanks @michaelmaitland and @topperc!
// Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getKnownMinValue() == 64) {
Use getFixedValue() instead of getKnownMinValue(). It should not be scalable since we already checked !Ty.isVector().
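For reference, the distinction this relies on, sketched with the TypeSize API (an illustration only, assuming the factory helpers TypeSize::getFixed and TypeSize::getScalable):

  TypeSize Fixed = TypeSize::getFixed(64);       // exactly 64 bits
  TypeSize Scalable = TypeSize::getScalable(64); // vscale x 64 bits
  uint64_t A = Fixed.getFixedValue();            // 64; asserts the size is not scalable
  uint64_t B = Scalable.getKnownMinValue();      // 64; the real size is a runtime multiple
  // Scalable.getFixedValue() would assert, which is why the !Ty.isVector()
  // guard must come before the getFixedValue() call.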
// Use FPR64 for s64 stores on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getKnownMinValue() == 64) {
Use getFixedValue()
TypeSize Size = Ty.getSizeInBits();
if (Ty.isVector()) {
  OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
  OpdsMapping[1] = getVRBValueMapping(Size.getKnownMinValue());
OpdsMapping[1] should always be GPRValueMapping.
TypeSize Size = Ty.getSizeInBits();
if (Ty.isVector()) {
  OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
  OpdsMapping[1] = getVRBValueMapping(Size.getKnownMinValue());
OpdsMapping[1] should always be GPRValueMapping.
OpdsMapping[1] = GPRValueMapping;
// Use FPR64 for s64 loads on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getFixedValue() == 64) {
Size.getFixedValue()
OpdsMapping[1] = GPRValueMapping;
// Use FPR64 for s64 stores on rv32.
-    if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+    if (!Ty.isVector() && GPRSize == 32 &&
+        Ty.getSizeInBits().getFixedValue() == 64) {
Size.getFixedValue()
LGTM
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-llvm-regalloc

Author: Jiahan Xie (jiahanxie353)

Changes

This patch supports GlobalISel for RegBank selection for Scalable Vector load and store instructions.

Patch is 137.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/99932.diff

4 Files Affected:
diff --git a/llvm/lib/CodeGen/RegisterBankInfo.cpp b/llvm/lib/CodeGen/RegisterBankInfo.cpp
index 72b07eb1902d9..00dcc1fbcd0c7 100644
--- a/llvm/lib/CodeGen/RegisterBankInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterBankInfo.cpp
@@ -215,8 +215,9 @@ RegisterBankInfo::getInstrMappingImpl(const MachineInstr &MI) const {
}
}
- unsigned Size = getSizeInBits(Reg, MRI, TRI);
- const ValueMapping *ValMapping = &getValueMapping(0, Size, *CurRegBank);
+ TypeSize Size = getSizeInBits(Reg, MRI, TRI);
+ const ValueMapping *ValMapping =
+ &getValueMapping(0, Size.getKnownMinValue(), *CurRegBank);
if (IsCopyLike) {
if (!OperandsMapping[0]) {
if (MI.isRegSequence()) {
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
index 43bbc8589e7e2..f7279bbbd6488 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVRegisterBankInfo.cpp
@@ -310,10 +310,15 @@ RISCVRegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
switch (Opc) {
case TargetOpcode::G_LOAD: {
LLT Ty = MRI.getType(MI.getOperand(0).getReg());
- OpdsMapping[0] = GPRValueMapping;
+ TypeSize Size = Ty.getSizeInBits();
+ if (Ty.isVector())
+ OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
+ else
+ OpdsMapping[0] = GPRValueMapping;
+
OpdsMapping[1] = GPRValueMapping;
// Use FPR64 for s64 loads on rv32.
- if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+ if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {
assert(MF.getSubtarget<RISCVSubtarget>().hasStdExtD());
OpdsMapping[0] = getFPValueMapping(Ty.getSizeInBits());
break;
@@ -333,10 +338,15 @@ RISCVRegisterBankInfo::getInstrMapping(const MachineInstr &MI) const {
}
case TargetOpcode::G_STORE: {
LLT Ty = MRI.getType(MI.getOperand(0).getReg());
- OpdsMapping[0] = GPRValueMapping;
+ TypeSize Size = Ty.getSizeInBits();
+ if (Ty.isVector())
+ OpdsMapping[0] = getVRBValueMapping(Size.getKnownMinValue());
+ else
+ OpdsMapping[0] = GPRValueMapping;
+
OpdsMapping[1] = GPRValueMapping;
// Use FPR64 for s64 stores on rv32.
- if (GPRSize == 32 && Ty.getSizeInBits() == 64) {
+ if (!Ty.isVector() && GPRSize == 32 && Size.getFixedValue() == 64) {
assert(MF.getSubtarget<RISCVSubtarget>().hasStdExtD());
OpdsMapping[0] = getFPValueMapping(Ty.getSizeInBits());
break;
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir b/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir
new file mode 100644
index 0000000000000..73ac2702cf9d4
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/regbankselect/rvv/load.mir
@@ -0,0 +1,1698 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=riscv32 -mattr=+m,+v -run-pass=regbankselect \
+# RUN: -disable-gisel-legality-check -simplify-mir -verify-machineinstrs %s \
+# RUN: -o - | FileCheck -check-prefix=RV32I %s
+# RUN: llc -mtriple=riscv64 -mattr=+m,+v -run-pass=regbankselect \
+# RUN: -disable-gisel-legality-check -simplify-mir -verify-machineinstrs %s \
+# RUN: -o - | FileCheck -check-prefix=RV64I %s
+--- |
+
+ define <vscale x 1 x i8> @vload_nx1i8(ptr %pa) #0 {
+ %va = load <vscale x 1 x i8>, ptr %pa, align 1
+ ret <vscale x 1 x i8> %va
+ }
+
+ define <vscale x 2 x i8> @vload_nx2i8(ptr %pa) #0 {
+ %va = load <vscale x 2 x i8>, ptr %pa, align 2
+ ret <vscale x 2 x i8> %va
+ }
+
+ define <vscale x 4 x i8> @vload_nx4i8(ptr %pa) #0 {
+ %va = load <vscale x 4 x i8>, ptr %pa, align 4
+ ret <vscale x 4 x i8> %va
+ }
+
+ define <vscale x 8 x i8> @vload_nx8i8(ptr %pa) #0 {
+ %va = load <vscale x 8 x i8>, ptr %pa, align 8
+ ret <vscale x 8 x i8> %va
+ }
+
+ define <vscale x 16 x i8> @vload_nx16i8(ptr %pa) #0 {
+ %va = load <vscale x 16 x i8>, ptr %pa, align 16
+ ret <vscale x 16 x i8> %va
+ }
+
+ define <vscale x 32 x i8> @vload_nx32i8(ptr %pa) #0 {
+ %va = load <vscale x 32 x i8>, ptr %pa, align 32
+ ret <vscale x 32 x i8> %va
+ }
+
+ define <vscale x 64 x i8> @vload_nx64i8(ptr %pa) #0 {
+ %va = load <vscale x 64 x i8>, ptr %pa, align 64
+ ret <vscale x 64 x i8> %va
+ }
+
+ define <vscale x 1 x i16> @vload_nx1i16(ptr %pa) #0 {
+ %va = load <vscale x 1 x i16>, ptr %pa, align 2
+ ret <vscale x 1 x i16> %va
+ }
+
+ define <vscale x 2 x i16> @vload_nx2i16(ptr %pa) #0 {
+ %va = load <vscale x 2 x i16>, ptr %pa, align 4
+ ret <vscale x 2 x i16> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 8
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 8 x i16> @vload_nx8i16(ptr %pa) #0 {
+ %va = load <vscale x 8 x i16>, ptr %pa, align 16
+ ret <vscale x 8 x i16> %va
+ }
+
+ define <vscale x 16 x i16> @vload_nx16i16(ptr %pa) #0 {
+ %va = load <vscale x 16 x i16>, ptr %pa, align 32
+ ret <vscale x 16 x i16> %va
+ }
+
+ define <vscale x 32 x i16> @vload_nx32i16(ptr %pa) #0 {
+ %va = load <vscale x 32 x i16>, ptr %pa, align 64
+ ret <vscale x 32 x i16> %va
+ }
+
+ define <vscale x 1 x i32> @vload_nx1i32(ptr %pa) #0 {
+ %va = load <vscale x 1 x i32>, ptr %pa, align 4
+ ret <vscale x 1 x i32> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 8
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 4 x i32> @vload_nx4i32(ptr %pa) #0 {
+ %va = load <vscale x 4 x i32>, ptr %pa, align 16
+ ret <vscale x 4 x i32> %va
+ }
+
+ define <vscale x 8 x i32> @vload_nx8i32(ptr %pa) #0 {
+ %va = load <vscale x 8 x i32>, ptr %pa, align 32
+ ret <vscale x 8 x i32> %va
+ }
+
+ define <vscale x 16 x i32> @vload_nx16i32(ptr %pa) #0 {
+ %va = load <vscale x 16 x i32>, ptr %pa, align 64
+ ret <vscale x 16 x i32> %va
+ }
+
+ define <vscale x 1 x i64> @vload_nx1i64(ptr %pa) #0 {
+ %va = load <vscale x 1 x i64>, ptr %pa, align 8
+ ret <vscale x 1 x i64> %va
+ }
+
+ define <vscale x 2 x i64> @vload_nx2i64(ptr %pa) #0 {
+ %va = load <vscale x 2 x i64>, ptr %pa, align 16
+ ret <vscale x 2 x i64> %va
+ }
+
+ define <vscale x 4 x i64> @vload_nx4i64(ptr %pa) #0 {
+ %va = load <vscale x 4 x i64>, ptr %pa, align 32
+ ret <vscale x 4 x i64> %va
+ }
+
+ define <vscale x 8 x i64> @vload_nx8i64(ptr %pa) #0 {
+ %va = load <vscale x 8 x i64>, ptr %pa, align 64
+ ret <vscale x 8 x i64> %va
+ }
+
+ define <vscale x 16 x i8> @vload_nx16i8_align1(ptr %pa) #0 {
+ %va = load <vscale x 16 x i8>, ptr %pa, align 1
+ ret <vscale x 16 x i8> %va
+ }
+
+ define <vscale x 16 x i8> @vload_nx16i8_align2(ptr %pa) #0 {
+ %va = load <vscale x 16 x i8>, ptr %pa, align 2
+ ret <vscale x 16 x i8> %va
+ }
+
+ define <vscale x 16 x i8> @vload_nx16i8_align16(ptr %pa) #0 {
+ %va = load <vscale x 16 x i8>, ptr %pa, align 16
+ ret <vscale x 16 x i8> %va
+ }
+
+ define <vscale x 16 x i8> @vload_nx16i8_align64(ptr %pa) #0 {
+ %va = load <vscale x 16 x i8>, ptr %pa, align 64
+ ret <vscale x 16 x i8> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16_align1(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 1
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16_align2(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 2
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16_align4(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 4
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16_align8(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 8
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 4 x i16> @vload_nx4i16_align16(ptr %pa) #0 {
+ %va = load <vscale x 4 x i16>, ptr %pa, align 16
+ ret <vscale x 4 x i16> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32_align2(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 2
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32_align4(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 4
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32_align8(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 8
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32_align16(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 16
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 2 x i32> @vload_nx2i32_align256(ptr %pa) #0 {
+ %va = load <vscale x 2 x i32>, ptr %pa, align 256
+ ret <vscale x 2 x i32> %va
+ }
+
+ define <vscale x 2 x i64> @vload_nx2i64_align4(ptr %pa) #0 {
+ %va = load <vscale x 2 x i64>, ptr %pa, align 4
+ ret <vscale x 2 x i64> %va
+ }
+
+ define <vscale x 2 x i64> @vload_nx2i64_align8(ptr %pa) #0 {
+ %va = load <vscale x 2 x i64>, ptr %pa, align 8
+ ret <vscale x 2 x i64> %va
+ }
+
+ define <vscale x 2 x i64> @vload_nx2i64_align16(ptr %pa) #0 {
+ %va = load <vscale x 2 x i64>, ptr %pa, align 16
+ ret <vscale x 2 x i64> %va
+ }
+
+ define <vscale x 2 x i64> @vload_nx2i64_align32(ptr %pa) #0 {
+ %va = load <vscale x 2 x i64>, ptr %pa, align 32
+ ret <vscale x 2 x i64> %va
+ }
+
+ define <vscale x 1 x ptr> @vload_nx1ptr(ptr %pa) #0 {
+ %va = load <vscale x 1 x ptr>, ptr %pa, align 4
+ ret <vscale x 1 x ptr> %va
+ }
+
+ define <vscale x 2 x ptr> @vload_nx2ptr(ptr %pa) #0 {
+ %va = load <vscale x 2 x ptr>, ptr %pa, align 8
+ ret <vscale x 2 x ptr> %va
+ }
+
+ define <vscale x 8 x ptr> @vload_nx8ptr(ptr %pa) #0 {
+ %va = load <vscale x 8 x ptr>, ptr %pa, align 32
+ ret <vscale x 8 x ptr> %va
+ }
+
+ attributes #0 = { "target-features"="+v" }
+
+...
+---
+name: vload_nx1i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx1i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8
+ ;
+ ; RV64I-LABEL: name: vload_nx1i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 1 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+ $v8 = COPY %1(<vscale x 1 x s8>)
+ PseudoRET implicit $v8
+
+...
+---
+name: vload_nx2i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx2i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 2 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8
+ ;
+ ; RV64I-LABEL: name: vload_nx2i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 2 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 2 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 2 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 2 x s8>) from %ir.pa)
+ $v8 = COPY %1(<vscale x 2 x s8>)
+ PseudoRET implicit $v8
+
+...
+---
+name: vload_nx4i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx4i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 4 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 4 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8
+ ;
+ ; RV64I-LABEL: name: vload_nx4i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 4 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 4 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 4 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 4 x s8>) from %ir.pa)
+ $v8 = COPY %1(<vscale x 4 x s8>)
+ PseudoRET implicit $v8
+
+...
+---
+name: vload_nx8i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx8i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 8 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 8 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8
+ ;
+ ; RV64I-LABEL: name: vload_nx8i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 8 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8 = COPY [[LOAD]](<vscale x 8 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 8 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 8 x s8>) from %ir.pa)
+ $v8 = COPY %1(<vscale x 8 x s8>)
+ PseudoRET implicit $v8
+
+...
+---
+name: vload_nx16i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx16i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 16 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8m2 = COPY [[LOAD]](<vscale x 16 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8m2
+ ;
+ ; RV64I-LABEL: name: vload_nx16i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 16 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8m2 = COPY [[LOAD]](<vscale x 16 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8m2
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 16 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 16 x s8>) from %ir.pa)
+ $v8m2 = COPY %1(<vscale x 16 x s8>)
+ PseudoRET implicit $v8m2
+
+...
+---
+name: vload_nx32i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx32i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 32 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8m4 = COPY [[LOAD]](<vscale x 32 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8m4
+ ;
+ ; RV64I-LABEL: name: vload_nx32i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 32 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8m4 = COPY [[LOAD]](<vscale x 32 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8m4
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 32 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 32 x s8>) from %ir.pa)
+ $v8m4 = COPY %1(<vscale x 32 x s8>)
+ PseudoRET implicit $v8m4
+
+...
+---
+name: vload_nx64i8
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx64i8
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 64 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+ ; RV32I-NEXT: $v8m8 = COPY [[LOAD]](<vscale x 64 x s8>)
+ ; RV32I-NEXT: PseudoRET implicit $v8m8
+ ;
+ ; RV64I-LABEL: name: vload_nx64i8
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV64I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV64I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV64I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 64 x s8>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+ ; RV64I-NEXT: $v8m8 = COPY [[LOAD]](<vscale x 64 x s8>)
+ ; RV64I-NEXT: PseudoRET implicit $v8m8
+ %0:gprb(p0) = COPY $x10
+ %2:vrb(p0) = COPY %0(p0)
+ %1:vrb(<vscale x 64 x s8>) = G_LOAD %2(p0) :: (load (<vscale x 64 x s8>) from %ir.pa)
+ $v8m8 = COPY %1(<vscale x 64 x s8>)
+ PseudoRET implicit $v8m8
+
+...
+---
+name: vload_nx1i16
+legalized: true
+tracksRegLiveness: true
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; RV32I-LABEL: name: vload_nx1i16
+ ; RV32I: liveins: $x10
+ ; RV32I-NEXT: {{ $}}
+ ; RV32I-NEXT: [[COPY:%[0-9]+]]:gprb(p0) = COPY $x10
+ ; RV32I-NEXT: [[COPY1:%[0-9]+]]:vrb(p0) = COPY [[COPY]](p0)
+ ; RV32I-NEXT: [[COPY2:%[0-9]+]]:gprb(p0) = COPY [[COPY1]](p0)
+ ; RV32I-NEXT: [[LOAD:%[0-9]+]]:vrb(<vscale x 1 x s16>) = G_LOAD [[COPY2]](p0) :: (load (<vscale x 1 x s16>) from %ir.pa)
+ ; RV32I-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s16>)
+ ; RV32I-NEXT: PseudoRET implicit $v8
+ ;
+ ; RV64I-LABEL: name: vload_nx1i16
+ ; RV64I: liveins: $x10
+ ; RV64I-NEXT: {{ $}}
+ ; RV64I-NEX...
[truncated]
Force-pushed from df003fd to ce12764.
Tests are not passing only because they fail in the MIParser. They will pass once that change in the legalizer patch gets merged.
Force-pushed from ce12764 to 80a710e.
LGTM
Force-pushed from 80a710e to ce3892a.
Force-pushed from ce3892a to 2fe4c15.
This patch supports GlobalISel for RegBank selection for Scalable Vector load and store instructions.