Skip to content

[AArch64] Separate PNR into its own Register Class #65306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 21, 2023

Conversation

MDevereau
Copy link
Contributor

This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.

@MDevereau MDevereau requested a review from a team as a code owner September 5, 2023 08:23
@MDevereau MDevereau changed the title [AArch64] Separate PNR into it's own Register Class [AArch64] Separate PNR into its own Register Class Sep 5, 2023
if ((DestReg.id() - AArch64::PN0) == (SrcReg.id() - AArch64::P0))
return;
DestReg = DestReg - AArch64::PN0 + AArch64::P0;
BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also add an implicitDef of the corresponding predicate-as-counter register?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should, I've added it

def PPR32 : PPRRegOp<"s", PPRAsmOp32, ElementSizeS, PPR>;
def PPR64 : PPRRegOp<"d", PPRAsmOp64, ElementSizeD, PPR>;

def PPRAsmOpAny : PPRAsmOperand<"PredicateAny", "PPR", 0>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is for line 921: def PPR_p8to15 : PPRClass<8, 15>;

Is this register class still used somewhere? I thought we added this for instructions that take predicate-as-counter registers (which are passed in pn8-pn15).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sort of, It was used in c52d950 because there was no alternative PNR class to use at the time. It's probably not very useful now, unless you still want to be able to constrain the upper end of normal predicate when writing asm? Maybe its useful with some of the SVE2p1/SME2 strided/contiguous load instructions

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if it's not used at the moment, please remove it from this patch. If it's needed in the future we can always add it later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, just ignore my comment above. The code is already in and used in getPredicateRegisterClass for inline-asm constraints. I'm not sure if anyone will ever use this for mask-predicates, but I guess there's no need to remove that functionality.

@@ -1348,9 +1348,14 @@ void AArch64InstPrinter::printPredicateAsCounter(const MCInst *MI,
const MCSubtargetInfo &STI,
raw_ostream &O) {
unsigned Reg = MI->getOperand(OpNum).getReg();
O << "pn";
if (Reg >= AArch64::P0 && Reg <= AArch64::P15)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change needed? I would expect a predicate-as-counter to be printed as a predicate-as-mask, but not the other way around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it's not needed, I've removed the normal predicate logic and tidied this if block up a bit.

; CHECK-NEXT: stack-id: scalable-vector, callee-saved-register: ''

; EXPAND-LABEL: name: spills_fills_stack_id_pnr
; EXPAND: STR_PXI $pn0, $sp, 7
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look entirely right because STR_PXI can't store $pn0, but it can store $p0.

I'm not sure if this is a problem though, because the encoding of $pn0 and $p0 is the same. Could you add the verifier to the RUN line, so that we know whether or not the MIR verifier accepts it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some changes to the spill/fill logic and this test is now outputting and asserting STR_PXI $p0, $sp, 7,implicit-def $pn0

@@ -3658,6 +3658,32 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
return;
}

if (AArch64::PNRRegClass.contains(DestReg) &&
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add some tests for these COPYs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added llvm/test/CodeGen/AArch64/PPRtoPNRCopy.mir and llvm/test/CodeGen/AArch64/PNRtoPPRCopy.mir

Copy link
Contributor

@david-arm david-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not done a full review, but I've left the comments I have so far!

assert(Subtarget.hasSVEorSME() && "Unexpected SVE register.");
if ((DestReg.id() - AArch64::P0) == (SrcReg.id() - AArch64::PN0))
return;
SrcReg = (SrcReg - AArch64::PN0) + AArch64::P0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add brackets around the inverted case above too, i.e. PPR->PNR? I think it's more obvious with the brackets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg)
.addReg(SrcReg)
.addReg(SrcReg)
.addReg(SrcReg, getKillRegState(KillSrc));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @sdesmalen-arm's comment about adding implicit def applies here too, i.e. an implicit def of the same PNR register.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -3658,6 +3658,32 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &MBB,
return;
}

if (AArch64::PNRRegClass.contains(DestReg) &&
AArch64::PPRRegClass.contains(SrcReg)) {
assert(Subtarget.hasSVEorSME() && "Unexpected SVE register.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed lower down you assert with

assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &&
"Unexpected register load without SVE2p1 or SME2");

and I think it makes sense to assert that here for the COPY instructions too, since predicate-as-counter registers were introduced in SVE2.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@llvmbot
Copy link
Member

llvmbot commented Sep 13, 2023

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-aarch64

Changes This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand. --

Patch is 78.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65306.diff

20 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+2)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+36)
  • (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.td (+106-66)
  • (modified) llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td (+12-12)
  • (modified) llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp (+27-16)
  • (modified) llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp (+18-3)
  • (modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp (+7-2)
  • (modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.h (-1)
  • (modified) llvm/lib/Target/AArch64/MCTargetDesc/AArch64MCCodeEmitter.cpp (+11)
  • (modified) llvm/lib/Target/AArch64/SMEInstrFormats.td (+8-8)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-inline-asm.ll (+11-12)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-unwind-inline-asm.ll (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/GlobalISel/regbank-inlineasm.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/aarch64-sve-asm.ll (+3)
  • (modified) llvm/test/CodeGen/AArch64/callbr-asm-outputs-indirect-isel.ll (+13-13)
  • (modified) llvm/test/CodeGen/AArch64/emit_fneg_with_non_register_operand.mir (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/peephole-insvigpr.mir (+1-2)
  • (modified) llvm/test/CodeGen/AArch64/preserve.ll (+2-2)
  • (modified) llvm/test/CodeGen/AArch64/spillfill-sve.mir (+45)
  • (modified) llvm/test/MC/AArch64/SVE/cntp-diagnostics.s (+1-1)

<pre>
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index aa7efc65949ae42..f4d3a85f34c88da 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -974,6 +974,8 @@ bool AArch64AsmPrinter::PrintAsmOperand(const MachineInstr *MI, unsigned OpNum,
RegClass = &amp;AArch64::ZPRRegClass;
} else if (AArch64::PPRRegClass.contains(Reg)) {
RegClass = &amp;AArch64::PPRRegClass;

  • } else if (AArch64::PNRRegClass.contains(Reg)) {

  •  RegClass = &amp;amp;AArch64::PNRRegClass;
    

    } else {
    RegClass = &amp;AArch64::FPR128RegClass;
    AltName = AArch64::vreg;
    diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
    index 3a18f37e31185d2..91d5dffaba11c9a 100644
    --- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
    +++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
    @@ -3668,6 +3668,32 @@ void AArch64InstrInfo::copyPhysReg(MachineBasicBlock &amp;MBB,
    return;
    }

  • if (AArch64::PNRRegClass.contains(DestReg) &amp;&amp;

  •  AArch64::PPRRegClass.contains(SrcReg)) {
    
  • assert(Subtarget.hasSVEorSME() &amp;&amp; &quot;Unexpected SVE register.&quot;);

  • if ((DestReg.id() - AArch64::PN0) == (SrcReg.id() - AArch64::P0))

  •  return;
    
  • DestReg = DestReg - AArch64::PN0 + AArch64::P0;

  • BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg)

  •    .addReg(SrcReg)
    
  •    .addReg(SrcReg)
    
  •    .addReg(SrcReg, getKillRegState(KillSrc));
    
  • return;

  • }

  • if (AArch64::PPRRegClass.contains(DestReg) &amp;&amp;

  •  AArch64::PNRRegClass.contains(SrcReg)) {
    
  • assert(Subtarget.hasSVEorSME() &amp;&amp; &quot;Unexpected SVE register.&quot;);

  • if ((DestReg.id() - AArch64::P0) == (SrcReg.id() - AArch64::PN0))

  •  return;
    
  • SrcReg = (SrcReg - AArch64::PN0) + AArch64::P0;

  • BuildMI(MBB, I, DL, get(AArch64::ORR_PPzPP), DestReg)

  •    .addReg(SrcReg)
    
  •    .addReg(SrcReg)
    
  •    .addReg(SrcReg, getKillRegState(KillSrc));
    
  • return;

  • }

  • // Copy a Z register by ORRing with itself.
    if (AArch64::ZPRRegClass.contains(DestReg) &amp;&amp;
    AArch64::ZPRRegClass.contains(SrcReg)) {
    @@ -3975,6 +4001,11 @@ void AArch64InstrInfo::storeRegToStackSlot(MachineBasicBlock &amp;MBB,
    &quot;Unexpected register store without SVE store instructions&quot;);
    Opc = AArch64::STR_PXI;
    StackID = TargetStackID::ScalableVector;

  • } else if (AArch64::PNRRegClass.hasSubClassEq(RC)) {

  •  assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &amp;amp;&amp;amp;
    
  •         &amp;quot;Unexpected register store without SVE2p1 or SME2&amp;quot;);
    
  •  Opc = AArch64::STR_PXI;
    
  •  StackID = TargetStackID::ScalableVector;
    

    }
    break;
    case 4:
    @@ -4138,6 +4169,11 @@ void AArch64InstrInfo::loadRegFromStackSlot(MachineBasicBlock &amp;MBB,
    &quot;Unexpected register load without SVE load instructions&quot;);
    Opc = AArch64::LDR_PXI;
    StackID = TargetStackID::ScalableVector;

  • } else if (AArch64::PNRRegClass.hasSubClassEq(RC)) {

  •  assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &amp;amp;&amp;amp;
    
  •         &amp;quot;Unexpected register load without SVE2p1 or SME2&amp;quot;);
    
  •  Opc = AArch64::LDR_PXI;
    
  •  StackID = TargetStackID::ScalableVector;
    

    }
    break;
    case 4:
    diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
    index 18fc8c77a0e44d9..3f74c74ad1d6d8d 100644
    --- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
    +++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
    @@ -55,6 +55,8 @@ let Namespace = &quot;AArch64&quot; in {
    def zasubd1 : SubRegIndex&lt;256&gt;; // (16 x 16)/8 bytes = 256 bits
    def zasubq0 : SubRegIndex&lt;128&gt;; // (16 x 16)/16 bytes = 128 bits
    def zasubq1 : SubRegIndex&lt;128&gt;; // (16 x 16)/16 bytes = 128 bits

  • def psub : SubRegIndex&lt;16&gt;;
    }

let Namespace = &quot;AArch64&quot; in {
@@ -767,23 +769,43 @@ def GPR64x8 : RegisterOperand&lt;GPR64x8Class, &quot;printGPR64x8&quot;&gt; {

//===----- END: v8.7a accelerator extension register operands -------------===//

+// SVE predicate-as-counter registers

  • def PN0 : AArch64Reg&lt;0, &quot;pn0&quot;&gt;, DwarfRegNum&lt;[48]&gt;;
  • def PN1 : AArch64Reg&lt;1, &quot;pn1&quot;&gt;, DwarfRegNum&lt;[49]&gt;;
  • def PN2 : AArch64Reg&lt;2, &quot;pn2&quot;&gt;, DwarfRegNum&lt;[50]&gt;;
  • def PN3 : AArch64Reg&lt;3, &quot;pn3&quot;&gt;, DwarfRegNum&lt;[51]&gt;;
  • def PN4 : AArch64Reg&lt;4, &quot;pn4&quot;&gt;, DwarfRegNum&lt;[52]&gt;;
  • def PN5 : AArch64Reg&lt;5, &quot;pn5&quot;&gt;, DwarfRegNum&lt;[53]&gt;;
  • def PN6 : AArch64Reg&lt;6, &quot;pn6&quot;&gt;, DwarfRegNum&lt;[54]&gt;;
  • def PN7 : AArch64Reg&lt;7, &quot;pn7&quot;&gt;, DwarfRegNum&lt;[55]&gt;;
  • def PN8 : AArch64Reg&lt;8, &quot;pn8&quot;&gt;, DwarfRegNum&lt;[56]&gt;;
  • def PN9 : AArch64Reg&lt;9, &quot;pn9&quot;&gt;, DwarfRegNum&lt;[57]&gt;;
  • def PN10 : AArch64Reg&lt;10, &quot;pn10&quot;&gt;, DwarfRegNum&lt;[58]&gt;;
  • def PN11 : AArch64Reg&lt;11, &quot;pn11&quot;&gt;, DwarfRegNum&lt;[59]&gt;;
  • def PN12 : AArch64Reg&lt;12, &quot;pn12&quot;&gt;, DwarfRegNum&lt;[60]&gt;;
  • def PN13 : AArch64Reg&lt;13, &quot;pn13&quot;&gt;, DwarfRegNum&lt;[61]&gt;;
  • def PN14 : AArch64Reg&lt;14, &quot;pn14&quot;&gt;, DwarfRegNum&lt;[62]&gt;;
  • def PN15 : AArch64Reg&lt;15, &quot;pn15&quot;&gt;, DwarfRegNum&lt;[63]&gt;;

// SVE predicate registers
-def P0 : AArch64Reg&lt;0, &quot;p0&quot;&gt;, DwarfRegNum&lt;[48]&gt;;
-def P1 : AArch64Reg&lt;1, &quot;p1&quot;&gt;, DwarfRegNum&lt;[49]&gt;;
-def P2 : AArch64Reg&lt;2, &quot;p2&quot;&gt;, DwarfRegNum&lt;[50]&gt;;
-def P3 : AArch64Reg&lt;3, &quot;p3&quot;&gt;, DwarfRegNum&lt;[51]&gt;;
-def P4 : AArch64Reg&lt;4, &quot;p4&quot;&gt;, DwarfRegNum&lt;[52]&gt;;
-def P5 : AArch64Reg&lt;5, &quot;p5&quot;&gt;, DwarfRegNum&lt;[53]&gt;;
-def P6 : AArch64Reg&lt;6, &quot;p6&quot;&gt;, DwarfRegNum&lt;[54]&gt;;
-def P7 : AArch64Reg&lt;7, &quot;p7&quot;&gt;, DwarfRegNum&lt;[55]&gt;;
-def P8 : AArch64Reg&lt;8, &quot;p8&quot;&gt;, DwarfRegNum&lt;[56]&gt;;
-def P9 : AArch64Reg&lt;9, &quot;p9&quot;&gt;, DwarfRegNum&lt;[57]&gt;;
-def P10 : AArch64Reg&lt;10, &quot;p10&quot;&gt;, DwarfRegNum&lt;[58]&gt;;
-def P11 : AArch64Reg&lt;11, &quot;p11&quot;&gt;, DwarfRegNum&lt;[59]&gt;;
-def P12 : AArch64Reg&lt;12, &quot;p12&quot;&gt;, DwarfRegNum&lt;[60]&gt;;
-def P13 : AArch64Reg&lt;13, &quot;p13&quot;&gt;, DwarfRegNum&lt;[61]&gt;;
-def P14 : AArch64Reg&lt;14, &quot;p14&quot;&gt;, DwarfRegNum&lt;[62]&gt;;
-def P15 : AArch64Reg&lt;15, &quot;p15&quot;&gt;, DwarfRegNum&lt;[63]&gt;;
+let SubRegIndices = [psub] in {

  • def P0 : AArch64Reg&lt;0, &quot;p0&quot;, [PN0]&gt;, DwarfRegAlias&lt;PN0&gt;;
  • def P1 : AArch64Reg&lt;1, &quot;p1&quot;, [PN1]&gt;, DwarfRegAlias&lt;PN1&gt;;
  • def P2 : AArch64Reg&lt;2, &quot;p2&quot;, [PN2]&gt;, DwarfRegAlias&lt;PN2&gt;;
  • def P3 : AArch64Reg&lt;3, &quot;p3&quot;, [PN3]&gt;, DwarfRegAlias&lt;PN3&gt;;
  • def P4 : AArch64Reg&lt;4, &quot;p4&quot;, [PN4]&gt;, DwarfRegAlias&lt;PN4&gt;;
  • def P5 : AArch64Reg&lt;5, &quot;p5&quot;, [PN5]&gt;, DwarfRegAlias&lt;PN5&gt;;
  • def P6 : AArch64Reg&lt;6, &quot;p6&quot;, [PN6]&gt;, DwarfRegAlias&lt;PN6&gt;;
  • def P7 : AArch64Reg&lt;7, &quot;p7&quot;, [PN7]&gt;, DwarfRegAlias&lt;PN7&gt;;
  • def P8 : AArch64Reg&lt;8, &quot;p8&quot;, [PN8]&gt;, DwarfRegAlias&lt;PN8&gt;;
  • def P9 : AArch64Reg&lt;9, &quot;p9&quot;, [PN9]&gt;, DwarfRegAlias&lt;PN9&gt;;
  • def P10 : AArch64Reg&lt;10, &quot;p10&quot;, [PN10]&gt;, DwarfRegAlias&lt;PN10&gt;;
  • def P11 : AArch64Reg&lt;11, &quot;p11&quot;, [PN11]&gt;, DwarfRegAlias&lt;PN11&gt;;
  • def P12 : AArch64Reg&lt;12, &quot;p12&quot;, [PN12]&gt;, DwarfRegAlias&lt;PN12&gt;;
  • def P13 : AArch64Reg&lt;13, &quot;p13&quot;, [PN13]&gt;, DwarfRegAlias&lt;PN13&gt;;
  • def P14 : AArch64Reg&lt;14, &quot;p14&quot;, [PN14]&gt;, DwarfRegAlias&lt;PN14&gt;;
  • def P15 : AArch64Reg&lt;15, &quot;p15&quot;, [PN15]&gt;, DwarfRegAlias&lt;PN15&gt;;
    +}

// The part of SVE registers that don&#x27;t overlap Neon registers.
// These are only used as part of clobber lists.
@@ -881,8 +903,6 @@ class SVERegOp &lt;string Suffix, AsmOperandClass C,
let ParserMatchClass = C;
}

-class PPRRegOp &lt;string Suffix, AsmOperandClass C, ElementSizeEnum Size,

  •            RegisterClass RC&amp;gt; : SVERegOp&amp;lt;Suffix, C, Size, RC&amp;gt; {}
    

class ZPRRegOp &lt;string Suffix, AsmOperandClass C, ElementSizeEnum Size,
RegisterClass RC&gt; : SVERegOp&lt;Suffix, C, Size, RC&gt; {}

@@ -891,7 +911,7 @@ class ZPRRegOp &lt;string Suffix, AsmOperandClass C, ElementSizeEnum Size,
// SVE predicate register classes.
class PPRClass&lt;int firstreg, int lastreg&gt; : RegisterClass&lt;
&quot;AArch64&quot;,

  •                              [ nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1, aarch64svcount ], 16,
    
  •                              [ nxv16i1, nxv8i1, nxv4i1, nxv2i1, nxv1i1 ], 16,
                                 (sequence &amp;quot;P%u&amp;quot;, firstreg, lastreg)&amp;gt; {
    
    let Size = 16;
    }
    @@ -909,69 +929,89 @@ class PPRAsmOperand &lt;string name, string RegClass, int Width&gt;: AsmOperandClass {
    let ParserMethod = &quot;tryParseSVEPredicateVector&lt;RegKind::SVEPredicateVector&gt;&quot;;
    }

-def PPRAsmOpAny : PPRAsmOperand&lt;&quot;PredicateAny&quot;, &quot;PPR&quot;, 0&gt;;
-def PPRAsmOp8 : PPRAsmOperand&lt;&quot;PredicateB&quot;, &quot;PPR&quot;, 8&gt;;
-def PPRAsmOp16 : PPRAsmOperand&lt;&quot;PredicateH&quot;, &quot;PPR&quot;, 16&gt;;
-def PPRAsmOp32 : PPRAsmOperand&lt;&quot;PredicateS&quot;, &quot;PPR&quot;, 32&gt;;
-def PPRAsmOp64 : PPRAsmOperand&lt;&quot;PredicateD&quot;, &quot;PPR&quot;, 64&gt;;

-def PPRAny : PPRRegOp&lt;&quot;&quot;, PPRAsmOpAny, ElementSizeNone, PPR&gt;;
-def PPR8 : PPRRegOp&lt;&quot;b&quot;, PPRAsmOp8, ElementSizeB, PPR&gt;;
-def PPR16 : PPRRegOp&lt;&quot;h&quot;, PPRAsmOp16, ElementSizeH, PPR&gt;;
-def PPR32 : PPRRegOp&lt;&quot;s&quot;, PPRAsmOp32, ElementSizeS, PPR&gt;;
-def PPR64 : PPRRegOp&lt;&quot;d&quot;, PPRAsmOp64, ElementSizeD, PPR&gt;;

+def PPRAsmOpAny : PPRAsmOperand&lt;&quot;PredicateAny&quot;, &quot;PPR&quot;, 0&gt;;
+def PPRAsmOp8 : PPRAsmOperand&lt;&quot;PredicateB&quot;, &quot;PPR&quot;, 8&gt;;
+def PPRAsmOp16 : PPRAsmOperand&lt;&quot;PredicateH&quot;, &quot;PPR&quot;, 16&gt;;
+def PPRAsmOp32 : PPRAsmOperand&lt;&quot;PredicateS&quot;, &quot;PPR&quot;, 32&gt;;
+def PPRAsmOp64 : PPRAsmOperand&lt;&quot;PredicateD&quot;, &quot;PPR&quot;, 64&gt;;
def PPRAsmOp3bAny : PPRAsmOperand&lt;&quot;Predicate3bAny&quot;, &quot;PPR_3b&quot;, 0&gt;;

+class PPRRegOp &lt;string Suffix, AsmOperandClass C, ElementSizeEnum Size,

  •            RegisterClass RC&amp;gt; : SVERegOp&amp;lt;Suffix, C, Size, RC&amp;gt; {}
    

+def PPRAny : PPRRegOp&lt;&quot;&quot;, PPRAsmOpAny, ElementSizeNone, PPR&gt;;
+def PPR8 : PPRRegOp&lt;&quot;b&quot;, PPRAsmOp8, ElementSizeB, PPR&gt;;
+def PPR16 : PPRRegOp&lt;&quot;h&quot;, PPRAsmOp16, ElementSizeH, PPR&gt;;
+def PPR32 : PPRRegOp&lt;&quot;s&quot;, PPRAsmOp32, ElementSizeS, PPR&gt;;
+def PPR64 : PPRRegOp&lt;&quot;d&quot;, PPRAsmOp64, ElementSizeD, PPR&gt;;
def PPR3bAny : PPRRegOp&lt;&quot;&quot;, PPRAsmOp3bAny, ElementSizeNone, PPR_3b&gt;;

+class PNRClass&lt;int firstreg, int lastreg&gt; : RegisterClass&lt;

  •                              &amp;quot;AArch64&amp;quot;,
    
  •                              [ aarch64svcount ], 16,
    
  •                              (sequence &amp;quot;PN%u&amp;quot;, firstreg, lastreg)&amp;gt; {
    
  • let Size = 16;
    +}

+def PNR : PNRClass&lt;0, 15&gt;;
+def PNR_p8to15 : PNRClass&lt;8, 15&gt;;

// SVE predicate-as-counter operand
-class PNRAsmOperand&lt;string name, string RegClass, int Width&gt;

  • : PPRAsmOperand&lt;name, RegClass, Width&gt; {
    +class PNRAsmOperand&lt;string name, string RegClass, int Width&gt;: AsmOperandClass {
  • let Name = &quot;SVE&quot; # name # &quot;Reg&quot;;
    let PredicateMethod = &quot;isSVEPredicateAsCounterRegOfWidth&lt;&quot;
    # Width # &quot;, &quot; # &quot;AArch64::&quot;
    # RegClass # &quot;RegClassID&gt;&quot;;
    let DiagnosticType = &quot;InvalidSVE&quot; # name # &quot;Reg&quot;;
  • let RenderMethod = &quot;addRegOperands&quot;;
    let ParserMethod = &quot;tryParseSVEPredicateVector&lt;RegKind::SVEPredicateAsCounter&gt;&quot;;
    }

-class PNRRegOp&lt;string Suffix, AsmOperandClass C, int EltSize, RegisterClass RC&gt;

  • : PPRRegOp&lt;Suffix, C, ElementSizeNone, RC&gt; {
  • let PrintMethod = &quot;printPredicateAsCounter&lt;&quot; # EltSize # &quot;&gt;&quot;;
    +let RenderMethod = &quot;addPNRasPPRRegOperands&quot; in {
  • def PNRasPPROpAny : PNRAsmOperand&lt;&quot;PNRasPPRPredicateAny&quot;, &quot;PNR&quot;, 0&gt;;
  • def PNRasPPROp8 : PNRAsmOperand&lt;&quot;PNRasPPRPredicateB&quot;, &quot;PNR&quot;, 8&gt;;
    }

-def PNRAsmOpAny: PNRAsmOperand&lt;&quot;PNPredicateAny&quot;, &quot;PPR&quot;, 0&gt;;
-def PNRAsmOp8 : PNRAsmOperand&lt;&quot;PNPredicateB&quot;, &quot;PPR&quot;, 8&gt;;
-def PNRAsmOp16 : PNRAsmOperand&lt;&quot;PNPredicateH&quot;, &quot;PPR&quot;, 16&gt;;
-def PNRAsmOp32 : PNRAsmOperand&lt;&quot;PNPredicateS&quot;, &quot;PPR&quot;, 32&gt;;
-def PNRAsmOp64 : PNRAsmOperand&lt;&quot;PNPredicateD&quot;, &quot;PPR&quot;, 64&gt;;

-def PNRAny : PNRRegOp&lt;&quot;&quot;, PNRAsmOpAny, 0, PPR&gt;;
-def PNR8 : PNRRegOp&lt;&quot;b&quot;, PNRAsmOp8, 8, PPR&gt;;
-def PNR16 : PNRRegOp&lt;&quot;h&quot;, PNRAsmOp16, 16, PPR&gt;;
-def PNR32 : PNRRegOp&lt;&quot;s&quot;, PNRAsmOp32, 32, PPR&gt;;
-def PNR64 : PNRRegOp&lt;&quot;d&quot;, PNRAsmOp64, 64, PPR&gt;;

-class PNRP8to15RegOp&lt;string Suffix, AsmOperandClass C, int EltSize, RegisterClass RC&gt;

  • : PPRRegOp&lt;Suffix, C, ElementSizeNone, RC&gt; {
  • let PrintMethod = &quot;printPredicateAsCounter&lt;&quot; # EltSize # &quot;&gt;&quot;;
  • let EncoderMethod = &quot;EncodePPR_p8to15&quot;;
  • let DecoderMethod = &quot;DecodePPR_p8to15RegisterClass&quot;;
    -}

-def PNRAsmAny_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateAny_p8to15&quot;, &quot;PPR_p8to15&quot;, 0&gt;;
-def PNRAsmOp8_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateB_p8to15&quot;, &quot;PPR_p8to15&quot;, 8&gt;;
-def PNRAsmOp16_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateH_p8to15&quot;, &quot;PPR_p8to15&quot;, 16&gt;;
-def PNRAsmOp32_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateS_p8to15&quot;, &quot;PPR_p8to15&quot;, 32&gt;;
-def PNRAsmOp64_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateD_p8to15&quot;, &quot;PPR_p8to15&quot;, 64&gt;;

-def PNRAny_p8to15 : PNRP8to15RegOp&lt;&quot;&quot;, PNRAsmAny_p8to15, 0, PPR_p8to15&gt;;
-def PNR8_p8to15 : PNRP8to15RegOp&lt;&quot;b&quot;, PNRAsmOp8_p8to15, 8, PPR_p8to15&gt;;
-def PNR16_p8to15 : PNRP8to15RegOp&lt;&quot;h&quot;, PNRAsmOp16_p8to15, 16, PPR_p8to15&gt;;
-def PNR32_p8to15 : PNRP8to15RegOp&lt;&quot;s&quot;, PNRAsmOp32_p8to15, 32, PPR_p8to15&gt;;
-def PNR64_p8to15 : PNRP8to15RegOp&lt;&quot;d&quot;, PNRAsmOp64_p8to15, 64, PPR_p8to15&gt;;
+class PNRasPPRRegOp&lt;string Suffix, AsmOperandClass C, ElementSizeEnum Size,

  •            RegisterClass RC&amp;gt; : SVERegOp&amp;lt;Suffix, C, Size, RC&amp;gt; {}
    

+def PNRasPPRAny : PNRasPPRRegOp&lt;&quot;&quot;, PNRasPPROpAny, ElementSizeNone, PPR&gt;;
+def PNRasPPR8 : PNRasPPRRegOp&lt;&quot;b&quot;, PNRasPPROp8, ElementSizeB, PPR&gt;;
+
+def PNRAsmOpAny: PNRAsmOperand&lt;&quot;PNPredicateAny&quot;, &quot;PNR&quot;, 0&gt;;
+def PNRAsmOp8 : PNRAsmOperand&lt;&quot;PNPredicateB&quot;, &quot;PNR&quot;, 8&gt;;
+def PNRAsmOp16 : PNRAsmOperand&lt;&quot;PNPredicateH&quot;, &quot;PNR&quot;, 16&gt;;
+def PNRAsmOp32 : PNRAsmOperand&lt;&quot;PNPredicateS&quot;, &quot;PNR&quot;, 32&gt;;
+def PNRAsmOp64 : PNRAsmOperand&lt;&quot;PNPredicateD&quot;, &quot;PNR&quot;, 64&gt;;
+
+class PNRRegOp&lt;string Suffix, AsmOperandClass C, int Size, RegisterClass RC&gt;

  • : SVERegOp&lt;Suffix, C, ElementSizeNone, RC&gt; {
  • let PrintMethod = &quot;printPredicateAsCounter&lt;&quot; # Size # &quot;&gt;&quot;;
    +}
    +def PNRAny : PNRRegOp&lt;&quot;&quot;, PNRAsmOpAny, 0, PNR&gt;;
    +def PNR8 : PNRRegOp&lt;&quot;b&quot;, PNRAsmOp8, 8, PNR&gt;;
    +def PNR16 : PNRRegOp&lt;&quot;h&quot;, PNRAsmOp16, 16, PNR&gt;;
    +def PNR32 : PNRRegOp&lt;&quot;s&quot;, PNRAsmOp32, 32, PNR&gt;;
    +def PNR64 : PNRRegOp&lt;&quot;d&quot;, PNRAsmOp64, 64, PNR&gt;;

+def PNRAsmAny_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateAny_p8to15&quot;, &quot;PNR_p8to15&quot;, 0&gt;;
+def PNRAsmOp8_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateB_p8to15&quot;, &quot;PNR_p8to15&quot;, 8&gt;;
+def PNRAsmOp16_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateH_p8to15&quot;, &quot;PNR_p8to15&quot;, 16&gt;;
+def PNRAsmOp32_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateS_p8to15&quot;, &quot;PNR_p8to15&quot;, 32&gt;;
+def PNRAsmOp64_p8to15 : PNRAsmOperand&lt;&quot;PNPredicateD_p8to15&quot;, &quot;PNR_p8to15&quot;, 64&gt;;
+
+class PNRP8to15RegOp&lt;string Suffix, AsmOperandClass C, int Width, RegisterClass RC&gt;

  • : SVERegOp&lt;Suffix, C, ElementSizeNone, RC&gt; {
  • let PrintMethod = &quot;printPredicateAsCounter&lt;&quot; # Width # &quot;&gt;&quot;;
  • let EncoderMethod = &quot;EncodePNR_p8to15&quot;;
  • let DecoderMethod = &quot;DecodePNR_p8to15RegisterClass&quot;;
    +}

+def PNRAny_p8to15 : PNRP8to15RegOp&lt;&quot;&quot;, PNRAsmAny_p8to15, 0, PNR_p8to15&gt;;
+def PNR8_p8to15 : PNRP8to15RegOp&lt;&quot;b&quot;, PNRAsmOp8_p8to15, 8, PNR_p8to15&gt;;
+def PNR16_p8to15 : PNRP8to15RegOp&lt;&quot;h&quot;, PNRAsmOp16_p8to15, 16, PNR_p8to15&gt;;
+def PNR32_p8to15 : PNRP8to15RegOp&lt;&quot;s&quot;, PNRAsmOp32_p8to15, 32, PNR_p8to15&gt;;
+def PNR64_p8to15 : PNRP8to15RegOp&lt;&quot;d&quot;, PNRAsmOp64_p8to15, 64, PNR_p8to15&gt;;

let Namespace = &quot;AArch64&quot; in {
def psub0 : SubRegIndex&lt;16, -1&gt;;
diff --git a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
index 9c2b270faef125f..a7a64c6b20d84d3 100644
--- a/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
@@ -2544,8 +2544,8 @@ let Predicates = [HasSVEorSME] in {
def : Pat&lt;(nxv4f32 (bitconvert (nxv8bf16 ZPR:$src))), (nxv4f32 ZPR:$src)&gt;;
def : Pat&lt;(nxv2f64 (bitconvert (nxv8bf16 ZPR:$src))), (nxv2f64 ZPR:$src)&gt;;

  • def : Pat&lt;(nxv16i1 (bitconvert (aarch64svcount PPR:$src))), (nxv16i1 PPR:$src)&gt;;
  • def : Pat&lt;(aarch64svcount (bitconvert (nxv16i1 PPR:$src))), (aarch64svcount PPR:$src)&gt;;
  • def : Pat&lt;(nxv16i1 (bitconvert (aarch64svcount PNR:$src))), (nxv16i1 PPR:$src)&gt;;
  • def : Pat&l...

bb.0 (%ir-block.0):
; CHECK-LABEL: name: ptrue_d
; CHECK: renamable $pn8 = PTRUE_C_D
; CHECK-NEXT: $p0 = ORR_PPzPP $p8, $p8, killed $p8, implicit-def $pn8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

, implicit-def $pn8 isn't right, because it's not defined. $p0 is defined and you could add $pn0 as an implicit-def, but not $pn8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've changed the implicit def to be the destination register instead of the source now.

@@ -0,0 +1,58 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have expected PPRtoPNRCopy.mir to be almost the same as PNRtoPPRCopy.mir, with the PN and P swapped. Was there a reason that you couldn't construct such a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic was they're both from organic examples. For example in sve2p1-intrinsics-predicate-as-counter.ll the pext tests will generate a PPR->PNR copy while the ptrue tests in that file will generate a PNR->PPR copy. If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me.

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me

That would have my preference, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've combined and simplified them into PNRtoPPRCopy.mir now.

@@ -80,8 +80,8 @@ body: |
; CHECK-NEXT: stack-id: scalable-vector, callee-saved-register: ''

; EXPAND-LABEL: name: spills_fills_stack_id_pnr
; EXPAND: STR_PXI $pn0, $sp, 7
; EXPAND: $pn0 = LDR_PXI $sp, 7
; EXPAND: STR_PXI $p0, $sp, 7, implicit-def $pn0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also not implicitly define $pn0, because no registers are defined by a store.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the implicit def from store spills.

@@ -4156,6 +4170,10 @@ void AArch64InstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
} else if (AArch64::PNRRegClass.hasSubClassEq(RC)) {
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &&
"Unexpected register load without SVE2p1 or SME2");
if (DestReg != AArch64::SP) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same as previous comment: Why are you testing for AArch64::SP here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.

@@ -3993,6 +4000,10 @@ void AArch64InstrInfo::storeRegToStackSlot(MachineBasicBlock &MBB,
} else if (AArch64::PNRRegClass.hasSubClassEq(RC)) {
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &&
"Unexpected register store without SVE2p1 or SME2");
if (SrcReg != AArch64::SP) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you testing for AArch64::SP here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.

bb.0 (%ir-block.0):
; CHECK-LABEL: name: ptrue_d
bb.0:
; CHECK-LABEL: name: pnr_to_ppr
; CHECK: renamable $pn8 = PTRUE_C_D
; CHECK-NEXT: $p0 = ORR_PPzPP $p8, $p8, killed $p8, implicit-def $pn0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it do an implicit-def of $pn0 here? It's copying $pn8 -> $p0.
I would expect there only to be an implicit-def for a $pn* register if the destination of the COPY is a predicate-as-counter register.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@david-arm Would you mind chipping in here since you suggested this change here? #65306 (comment) I'm not well versed with implict defs in MIR so I'm not sure if this is necessary or not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a chat with @david-arm about this, who was concerned that the register allocator might not 'see' that $pn0 gets clobbered when $p0 gets written. However, with $pn0 being a subregister from $p0 that's not likely to be an issue, as the register allocator will know that writing $p0 also writes $pn0.

However, we concluded there probably is no harm in keeping the implicit-def either, so I'm happy to stick with what you've got now.

Copy link
Collaborator

@sdesmalen-arm sdesmalen-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

This patch separates PNR registers into their own register
class instead of sharing a register class with PPR
registers. This primarily allows us to return more accurate
register classes when applying assembly constraints, but
also more protection from supplying an incorrect predicate
type to an invalid register operand.
@MDevereau MDevereau merged commit b967f3a into llvm:main Sep 21, 2023
@MDevereau MDevereau deleted the pnr-separation branch September 21, 2023 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants