-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[AArch64] Separate PNR into its own Register Class #65306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
if ((DestReg.id() - AArch64::PN0) == (SrcReg.id() - AArch64::P0)) | ||
return; | ||
DestReg = DestReg - AArch64::PN0 + AArch64::P0; | ||
BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also add an implicitDef of the corresponding predicate-as-counter register?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it should, I've added it
def PPR32 : PPRRegOp<"s", PPRAsmOp32, ElementSizeS, PPR>; | ||
def PPR64 : PPRRegOp<"d", PPRAsmOp64, ElementSizeD, PPR>; | ||
|
||
def PPRAsmOpAny : PPRAsmOperand<"PredicateAny", "PPR", 0>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is for line 921: def PPR_p8to15 : PPRClass<8, 15>;
Is this register class still used somewhere? I thought we added this for instructions that take predicate-as-counter registers (which are passed in pn8-pn15).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of, It was used in c52d950 because there was no alternative PNR class to use at the time. It's probably not very useful now, unless you still want to be able to constrain the upper end of normal predicate when writing asm? Maybe its useful with some of the SVE2p1/SME2 strided/contiguous load instructions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if it's not used at the moment, please remove it from this patch. If it's needed in the future we can always add it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, just ignore my comment above. The code is already in and used in getPredicateRegisterClass
for inline-asm constraints. I'm not sure if anyone will ever use this for mask-predicates, but I guess there's no need to remove that functionality.
@@ -1348,9 +1348,14 @@ void AArch64InstPrinter::printPredicateAsCounter(const MCInst *MI, | |||
const MCSubtargetInfo &STI, | |||
raw_ostream &O) { | |||
unsigned Reg = MI->getOperand(OpNum).getReg(); | |||
O << "pn"; | |||
if (Reg >= AArch64::P0 && Reg <= AArch64::P15) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change needed? I would expect a predicate-as-counter to be printed as a predicate-as-mask, but not the other way around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not needed, I've removed the normal predicate logic and tidied this if block up a bit.
; CHECK-NEXT: stack-id: scalable-vector, callee-saved-register: '' | ||
|
||
; EXPAND-LABEL: name: spills_fills_stack_id_pnr | ||
; EXPAND: STR_PXI $pn0, $sp, 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look entirely right because STR_PXI can't store $pn0, but it can store $p0.
I'm not sure if this is a problem though, because the encoding of $pn0
and $p0
is the same. Could you add the verifier to the RUN line, so that we know whether or not the MIR verifier accepts it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some changes to the spill/fill logic and this test is now outputting and asserting STR_PXI $p0, $sp, 7,implicit-def $pn0
@@ -3658,6 +3658,32 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB, | |||
return; | |||
} | |||
|
|||
if (AArch64::PNRRegClass.contains(DestReg) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests for these COPYs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added llvm/test/CodeGen/AArch64/PPRtoPNRCopy.mir and llvm/test/CodeGen/AArch64/PNRtoPPRCopy.mir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not done a full review, but I've left the comments I have so far!
assert(Subtarget.hasSVEorSME() && "Unexpected SVE register."); | ||
if ((DestReg.id() - AArch64::P0) == (SrcReg.id() - AArch64::PN0)) | ||
return; | ||
SrcReg = (SrcReg - AArch64::PN0) + AArch64::P0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add brackets around the inverted case above too, i.e. PPR->PNR? I think it's more obvious with the brackets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg) | ||
.addReg(SrcReg) | ||
.addReg(SrcReg) | ||
.addReg(SrcReg, getKillRegState(KillSrc)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @sdesmalen-arm's comment about adding implicit def applies here too, i.e. an implicit def of the same PNR register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -3658,6 +3658,32 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB, | |||
return; | |||
} | |||
|
|||
if (AArch64::PNRRegClass.contains(DestReg) && | |||
AArch64::PPRRegClass.contains(SrcReg)) { | |||
assert(Subtarget.hasSVEorSME() && "Unexpected SVE register."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed lower down you assert with
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &&
"Unexpected register load without SVE2p1 or SME2");
and I think it makes sense to assert that here for the COPY instructions too, since predicate-as-counter registers were introduced in SVE2.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-aarch64 ChangesThis patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand. --Patch is 78.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65306.diff 20 Files Affected:
<pre>
let Namespace = "AArch64" in { //===----- END: v8.7a accelerator extension register operands -------------===// +// SVE predicate-as-counter registers
// SVE predicate registers
// The part of SVE registers that don't overlap Neon registers. -class PPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size,
class ZPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size, @@ -891,7 +911,7 @@ class ZPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size,
-def PPRAsmOpAny : PPRAsmOperand<"PredicateAny", "PPR", 0>;
|
ef2c776
to
0126977
Compare
bb.0 (%ir-block.0): | ||
; CHECK-LABEL: name: ptrue_d | ||
; CHECK: renamable $pn8 = PTRUE_C_D | ||
; CHECK-NEXT: $p0 = ORR_PPzPP $p8, $p8, killed $p8, implicit-def $pn8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, implicit-def $pn8
isn't right, because it's not defined. $p0
is defined and you could add $pn0
as an implicit-def, but not $pn8
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the implicit def to be the destination register instead of the source now.
@@ -0,0 +1,58 @@ | |||
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected PPRtoPNRCopy.mir to be almost the same as PNRtoPPRCopy.mir, with the PN and P swapped. Was there a reason that you couldn't construct such a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic was they're both from organic examples. For example in sve2p1-intrinsics-predicate-as-counter.ll
the pext tests will generate a PPR->PNR copy while the ptrue tests in that file will generate a PNR->PPR copy. If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me
That would have my preference, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've combined and simplified them into PNRtoPPRCopy.mir now.
@@ -80,8 +80,8 @@ body: | | |||
; CHECK-NEXT: stack-id: scalable-vector, callee-saved-register: '' | |||
|
|||
; EXPAND-LABEL: name: spills_fills_stack_id_pnr | |||
; EXPAND: STR_PXI $pn0, $sp, 7 | |||
; EXPAND: $pn0 = LDR_PXI $sp, 7 | |||
; EXPAND: STR_PXI $p0, $sp, 7, implicit-def $pn0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also not implicitly define $pn0
, because no registers are defined by a store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the implicit def from store spills.
@@ -4156,6 +4170,10 @@ void AArch64InstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB, | |||
} else if (AArch64::PNRRegClass.hasSubClassEq(RC)) { | |||
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) && | |||
"Unexpected register load without SVE2p1 or SME2"); | |||
if (DestReg != AArch64::SP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(same as previous comment: Why are you testing for AArch64::SP
here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.
@@ -3993,6 +4000,10 @@ void AArch64InstrInfo::storeRegToStackSlot(MachineBasicBlock &MBB, | |||
} else if (AArch64::PNRRegClass.hasSubClassEq(RC)) { | |||
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) && | |||
"Unexpected register store without SVE2p1 or SME2"); | |||
if (SrcReg != AArch64::SP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you testing for AArch64::SP here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.
bb.0 (%ir-block.0): | ||
; CHECK-LABEL: name: ptrue_d | ||
bb.0: | ||
; CHECK-LABEL: name: pnr_to_ppr | ||
; CHECK: renamable $pn8 = PTRUE_C_D | ||
; CHECK-NEXT: $p0 = ORR_PPzPP $p8, $p8, killed $p8, implicit-def $pn0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it do an implicit-def
of $pn0
here? It's copying $pn8
-> $p0
.
I would expect there only to be an implicit-def
for a $pn*
register if the destination of the COPY is a predicate-as-counter register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@david-arm Would you mind chipping in here since you suggested this change here? #65306 (comment) I'm not well versed with implict defs in MIR so I'm not sure if this is necessary or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a chat with @david-arm about this, who was concerned that the register allocator might not 'see' that $pn0
gets clobbered when $p0
gets written. However, with $pn0
being a subregister from $p0
that's not likely to be an issue, as the register allocator will know that writing $p0
also writes $pn0
.
However, we concluded there probably is no harm in keeping the implicit-def either, so I'm happy to stick with what you've got now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.
7197fe0
to
56d2802
Compare
This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.