-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[RISCV][GlobalISel] Legalize Scalable Vector Loads and Stores #84965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8c85470
to
87504bb
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
✅ With the latest revision this PR passed the Python code formatter. |
Can you please update PR description? |
@llvm/pr-subscribers-llvm-globalisel @llvm/pr-subscribers-backend-risc-v Author: Jiahan Xie (jiahanxie353) ChangesThis patch works on legalizing load instruction for scalable vectors Full diff: https://github.com/llvm/llvm-project/pull/84965.diff 5 Files Affected:
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
index 6afaea3f3fc5c6..5d60f4f1829397 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
@@ -220,7 +220,8 @@ struct TypePairAndMemDesc {
Align >= Other.Align &&
// FIXME: This perhaps should be stricter, but the current legality
// rules are written only considering the size.
- MemTy.getSizeInBits() == Other.MemTy.getSizeInBits();
+ MemTy.getSizeInBits().getKnownMinValue() ==
+ Other.MemTy.getSizeInBits().getKnownMinValue();
}
};
diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 1b25da8833e4fb..fb18a2fd4d3e8c 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -3330,8 +3330,12 @@ LegalizerHelper::LegalizeResult LegalizerHelper::lowerLoad(GAnyLoad &LoadMI) {
LLT MemTy = MMO.getMemoryType();
MachineFunction &MF = MIRBuilder.getMF();
- unsigned MemSizeInBits = MemTy.getSizeInBits();
- unsigned MemStoreSizeInBits = 8 * MemTy.getSizeInBytes();
+ unsigned MemSizeInBits = MemTy.isScalable()
+ ? MemTy.getSizeInBits().getKnownMinValue()
+ : MemTy.getSizeInBits();
+ unsigned MemStoreSizeInBits =
+ MemTy.isScalable() ? 8 * MemTy.getSizeInBytes().getKnownMinValue()
+ : 8 * MemTy.getSizeInBytes();
if (MemSizeInBits != MemStoreSizeInBits) {
if (MemTy.isVector())
diff --git a/llvm/lib/CodeGen/MIRParser/MIParser.cpp b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
index 691c60d22724f3..43f6d3219bc6da 100644
--- a/llvm/lib/CodeGen/MIRParser/MIParser.cpp
+++ b/llvm/lib/CodeGen/MIRParser/MIParser.cpp
@@ -3415,7 +3415,7 @@ bool MIParser::parseMachineMemoryOperand(MachineMemOperand *&Dest) {
if (expectAndConsume(MIToken::rparen))
return true;
- Size = MemoryType.getSizeInBytes();
+ Size = MemoryType.getSizeInBytes().getKnownMinValue();
}
MachinePointerInfo Ptr = MachinePointerInfo();
diff --git a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
index 64ae4e94a8c929..bcded4b227e7ef 100644
--- a/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+++ b/llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
@@ -210,12 +210,12 @@ RISCVLegalizerInfo::RISCVLegalizerInfo(const RISCVSubtarget &ST)
.clampScalar(0, s32, (XLen == 64 || ST.hasStdExtD()) ? s64 : s32)
.clampScalar(1, sXLen, sXLen);
- auto &LoadStoreActions =
- getActionDefinitionsBuilder({G_LOAD, G_STORE})
- .legalForTypesWithMemDesc({{s32, p0, s8, 8},
- {s32, p0, s16, 16},
- {s32, p0, s32, 32},
- {p0, p0, sXLen, XLen}});
+ auto &LoadStoreActions = getActionDefinitionsBuilder({G_LOAD, G_STORE})
+ .legalForTypesWithMemDesc({{s32, p0, s8, 8},
+ {s32, p0, s16, 16},
+ {s32, p0, s32, 32},
+ {p0, p0, sXLen, XLen},
+ {nxv1s8, p0, s8, 8}});
auto &ExtLoadActions =
getActionDefinitionsBuilder({G_SEXTLOAD, G_ZEXTLOAD})
.legalForTypesWithMemDesc({{s32, p0, s8, 8}, {s32, p0, s16, 16}});
diff --git a/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-load.mir b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-load.mir
new file mode 100644
index 00000000000000..1b62f555207165
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-load.mir
@@ -0,0 +1,29 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=riscv32 -mattr=+v -run-pass=legalizer %s -o - | FileCheck %s
+# RUN: llc -mtriple=riscv64 -mattr=+v -run-pass=legalizer %s -o - | FileCheck %s
+--- |
+
+ define <vscale x 1 x i8> @vload_nx1i8(ptr %pa) {
+ %va = load <vscale x 1 x i8>, ptr %pa
+ ret <vscale x 1 x i8> %va
+ }
+
+...
+---
+name: vload_nx1i8
+body: |
+ bb.1 (%ir-block.0):
+ liveins: $x10
+
+ ; CHECK-LABEL: name: vload_nx1i8
+ ; CHECK: liveins: $x10
+ ; CHECK-NEXT: {{ $}}
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x10
+ ; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(<vscale x 1 x s8>) = G_LOAD [[COPY]](p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+ ; CHECK-NEXT: $v8 = COPY [[LOAD]](<vscale x 1 x s8>)
+ ; CHECK-NEXT: PseudoRET implicit $v8
+ %0:_(p0) = COPY $x10
+ %1:_(<vscale x 1 x s8>) = G_LOAD %0(p0) :: (load (<vscale x 1 x s8>) from %ir.pa)
+ $v8 = COPY %1(<vscale x 1 x s8>)
+ PseudoRET implicit $v8
+
|
87504bb
to
9ce3a84
Compare
@@ -3330,16 +3330,17 @@ LegalizerHelper::LegalizeResult LegalizerHelper::lowerLoad(GAnyLoad &LoadMI) { | |||
LLT MemTy = MMO.getMemoryType(); | |||
MachineFunction &MF = MIRBuilder.getMF(); | |||
|
|||
unsigned MemSizeInBits = MemTy.getSizeInBits(); | |||
unsigned MemStoreSizeInBits = 8 * MemTy.getSizeInBytes(); | |||
unsigned MinMemSizeInBits = MemTy.getSizeInBits().getKnownMinValue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can do this as TypeSize MemSizeInBits
and TypeSize MinMemStoreSizeInBits
as Craig Pointed out, which allows us to avoid calling getKnownMinValue. That way we avoid the case comparing fixed vector and scalable vector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood.
I used getKnownMinValue
because we have cases for doing scalar arithmetic operations below. For example:
llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
Lines 3397 to 3411 in 9ce3a84
if (!isPowerOf2_32(MinMemSizeInBits)) { | |
// This load needs splitting into power of 2 sized loads. | |
LargeSplitSize = llvm::bit_floor(MinMemSizeInBits); | |
SmallSplitSize = MinMemSizeInBits - LargeSplitSize; | |
} else { | |
// This is already a power of 2, but we still need to split this in half. | |
// | |
// Assume we're being asked to decompose an unaligned load. | |
// TODO: If this requires multiple splits, handle them all at once. | |
auto &Ctx = MF.getFunction().getContext(); | |
if (TLI.allowsMemoryAccess(Ctx, MIRBuilder.getDataLayout(), MemTy, MMO)) | |
return UnableToLegalize; | |
SmallSplitSize = LargeSplitSize = MinMemSizeInBits / 2; | |
} |
bit_floor
has to take a scalar.
So are you suggesting we make them TypeSize MemSizeInBits
but call getKnownMinValue
on demand?
699b864
to
aab0ab4
Compare
When we have alignment restrictions for But to get such information, we need to have Can I get some help on this? Thanks! |
Check out These are all available in |
The DataLayout belongs to Module. The LegalizerInfo constructor runs before the Module exists or at least without knowledge of the Module. I don't think you can call allowsMemoryAccessForAlignment. A misaligned vector can be handled by converting it to a vle8. Not sure if the generic legalization can handle that or we need to do custom legalization for RISC-V. SelectionDAG uses custom legalization, see RISCVTargetLowering::expandUnalignedRVVLoad |
3c8423d
to
eee1b24
Compare
} | ||
|
||
define <vscale x 16 x i8> @vload_nx16i8(ptr %pa) { | ||
%va = load <vscale x 16 x i8>, ptr %pa, align 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should these be store instructions?
llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-store.mir
Outdated
Show resolved
Hide resolved
|
||
DstOp NewDstReg(NewLoadTy); | ||
MachineMemOperand *NewLoadMMO = MF->getMachineMemOperand( | ||
MachinePointerInfo(), MachineMemOperand::MOLoad, NewLoadTy, Alignment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be casting the original MI to GLoad and building the MachineMemOperand
using existing info. It looks like RISCVTargetLowering::ExpandUnalignedRVVLoad
uses the pointer info from the original load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MachinePointerInfo PI = cast<GLoad>(MI)->getMMO()->getPointerInfo();
MachineMemOperand *NewLoadMMO = MF->getMachineMemOperand(PI, MachineMemOperand::MOLoad, NewLoadTy, Alignment)
"Load instructions only have one MemOperand."); | ||
Align Alignment = (*MI.memoperands_begin())->getAlign(); | ||
MachineMemOperand *LoadMMO = MF->getMachineMemOperand( | ||
MachinePointerInfo(), MachineMemOperand::MOLoad, LoadTy, Alignment); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be casting the original MI to GLoad and building the MachineMemOperand using existing info. It looks like RISCVTargetLowering::ExpandUnalignedRVVLoad uses the mem operand from the original load.
|
||
auto NewLoad = MIB.buildLoad(NewDstReg, PtrReg, *NewLoadMMO); | ||
|
||
MIB.buildBitcast(DstReg, NewLoad.getReg(0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you might be able to pass NewLoad
here instead of NewLoad.getReg(0)
.
MachineMemOperand *NewLoadMMO = MF->getMachineMemOperand( | ||
MachinePointerInfo(), MachineMemOperand::MOLoad, NewLoadTy, Alignment); | ||
|
||
auto NewLoad = MIB.buildLoad(NewDstReg, PtrReg, *NewLoadMMO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are using this function for both load and store but the logic in it is only assuming it is legalizing a load.
llvm/test/CodeGen/RISCV/GlobalISel/legalizer/rvv/legalize-load.mir
Outdated
Show resolved
Hide resolved
eee1b24
to
faeb464
Compare
if (TLI->allowsMemoryAccessForAlignment(Ctx, DL, VT, *MMO)) | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really be considered in the default lower action. I've had a patch for years I need to get back to
faeb464
to
44a1b70
Compare
7501c8c
to
9d42002
Compare
|
||
LoadStoreActions.widenScalarToNextPow2(0, /* MinSize = */ 8) | ||
.lowerIfMemSizeNotByteSizePow2() | ||
.customIf(all(LegalityPredicate([=](const LegalityQuery &Query) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: IIUC the customIf
is to handle vector types that have non-standard align arguments. If that is an accurate description, could you add a comment clarifying this point for the reader? If someone isn't familiar with the code, it might not be clear why we are doing this special custom legalization for vector types that we addressed above.
@@ -3690,7 +3690,7 @@ LegalizerHelper::LegalizeResult LegalizerHelper::lowerLoad(GAnyLoad &LoadMI) { | |||
|
|||
if (!isPowerOf2_32(MemSizeInBits)) { | |||
// This load needs splitting into power of 2 sized loads. | |||
LargeSplitSize = llvm::bit_floor(MemSizeInBits); | |||
LargeSplitSize = llvm::bit_floor(MemSizeInBits.getKnownMinValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These values are used in the call to getMachineMemOperand. There is a signature:
MachineMemOperand *MachineFunction::getMachineMemOperand(
const MachineMemOperand *MMO, const MachinePointerInfo &PtrInfo, LLT Ty)
It might be safer to create a scalable MMO using Ty
rather than forcing it into fixed.
Ditto for the other instructions that are built with LargeSplitSize
and SmallSplitSize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean we should use LLT LargeSplitSize, SmallSplitSize
instead of uint64_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. Otherwise we risk loosing out on the fact that the size is actually scalable.
e86058f
to
12b0386
Compare
@@ -3690,7 +3690,7 @@ LegalizerHelper::LegalizeResult LegalizerHelper::lowerLoad(GAnyLoad &LoadMI) { | |||
|
|||
if (!isPowerOf2_32(MemSizeInBits)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this line need to be be updated if MemSizeInBits is a scalable size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, and I'm still pondering how to do it.
Don't have a super clear idea about how to do it. Partly because all of our test cases are captured in the custom
branch so they will enter this lower
branch (to be honest, I can just revert all my current changes in this file and support it later when there's an actual need); also partly because I don't quite understand why they require this load splitting into power of 2 sized loads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok let's drop the changes if they aren't tested.
12b0386
to
c0aaf55
Compare
Just dropped it. Marking it ready for the final review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This patch works on legalizing load and store instruction for scalable vectors