Skip to content

[AArch64] Add -mlr-for-calls-only to replace the now removed -ffixed-x30 flag. #98073

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jul 10, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions clang/include/clang/Driver/Options.td
Original file line number Diff line number Diff line change
Expand Up @@ -4917,6 +4917,9 @@ foreach i = {1-31} in
def ffixed_x#i : Flag<["-"], "ffixed-x"#i>, Group<m_Group>,
HelpText<"Reserve the x"#i#" register (AArch64/RISC-V only)">;

def mlr_for_calls_only : Flag<["-"], "mlr-for-calls-only">, Group<m_aarch64_Features_Group>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name seems okay.

HelpText<"Do not allocate the LR register for general purpose usage, only for calls. (AArch64 only)">;

foreach i = {8-15,18} in
def fcall_saved_x#i : Flag<["-"], "fcall-saved-x"#i>, Group<m_aarch64_Features_Group>,
HelpText<"Make the x"#i#" register call-saved (AArch64 only)">;
Expand Down
4 changes: 4 additions & 0 deletions clang/include/clang/Driver/aarch64-mlr-for-calls-only.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@

// RUN: %clang --target=aarch64-none-gnu -mlr-for-calls-only -### %s 2> %t
// RUN: FileCheck --check-prefix=CHECK < %t %s
// CHECK: "-target-feature" "+reserve-lr-for-ra"
3 changes: 3 additions & 0 deletions clang/lib/Driver/ToolChains/Arch/AArch64.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -398,6 +398,9 @@ void aarch64::getAArch64TargetFeatures(const Driver &D,
if (Args.hasArg(options::OPT_ffixed_x28))
Features.push_back("+reserve-x28");

if (Args.hasArg(options::OPT_mlr_for_calls_only))
Features.push_back("+reserve-lr-for-ra");

if (Args.hasArg(options::OPT_fcall_saved_x8))
Features.push_back("+call-saved-x8");

Expand Down
4 changes: 4 additions & 0 deletions llvm/lib/Target/AArch64/AArch64Features.td
Original file line number Diff line number Diff line change
Expand Up @@ -571,6 +571,10 @@ foreach i = {1-7,9-15,18,20-28} in
"Reserve X"#i#", making it unavailable "
"as a GPR">;

def FeatureReserveLRForRA : SubtargetFeature<"reserve-lr-for-ra",
"ReserveLRForRA", "true",
"Reserve LR for call use only">;

foreach i = {8-15,18} in
def FeatureCallSavedX#i : SubtargetFeature<"call-saved-x"#i,
"CustomCallSavedXRegs["#i#"]", "true", "Make X"#i#" callee saved.">;
Expand Down
7 changes: 5 additions & 2 deletions llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3267,10 +3267,13 @@ bool AArch64FrameLowering::spillCalleeSavedRegisters(
InsertSEH(MIB, TII, MachineInstr::FrameSetup);
} else { // The code when the pair of ZReg is not present
MachineInstrBuilder MIB = BuildMI(MBB, MI, DL, TII.get(StrOpc));
if (!MRI.isReserved(Reg1))
const AArch64RegisterInfo *RegInfo =
static_cast<const AArch64RegisterInfo *>(
MF.getSubtarget().getRegisterInfo());
if (!RegInfo->isStrictlyReservedReg(MF, Reg1))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not confident this does what you want: even if you get the liveness right inside FrameLowering, target-independent code like LivePhysRegs will ignore registers marked isReserved(). So I'm not sure any liveness computations based on a reserved LR can be trusted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not, but I'm not sure what other options we have. It's worked for so far except for this case with the outliner.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could unreserve LR after register allocation, I guess? That would have roughly the right semantics.

Or we could make the outliner use some other method to try to figure out the "liveness" of LR if it's a reserved register. I mean, if LR is reserved, the only way it can be live is if the prologue doesn't save/restore it (in which case it's live across the whole function).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could unreserve LR after register allocation, I guess? That would have roughly the right semantics.

Having getReservedRegs(MF) return different things depending on where it's running seems worse than the bug itself.

With shrink-wrapping the prolog can be in successor blocks, so LR can be live on some CFG paths and not others.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having getReservedRegs(MF) return different things depending on where it's running seems worse than the bug itself.

I think this can happen anyway? For example, with the frame/base pointer.

With shrink-wrapping the prolog can be in successor blocks, so LR can be live on some CFG paths and not others.

Oh, right.

Other possibilities:

  • We could mess with the allocation orders so LR isn't allocated, despite being an allocatable register. This has basically all the properties we want... but we don't have the necessary infrastructure to make it straightforward.
  • We could introduce a distinction between "reserved for regalloc" and "tracked for liveness". So we'd have registers where we track liveness, but the register allocator doesn't allocate them.

Maybe worth bringing up on Discourse to see if there are other opinions.

(I'm not trying to make things complicated here, but I really don't want to take shortcuts when it comes to core MIR datastructures.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can happen anyway? For example, with the frame/base pointer.

Indeed :(

We could mess with the allocation orders so LR isn't allocated, despite being an allocatable register. This has basically all the properties we want... but we don't have the necessary infrastructure to make it straightforward.

Yeah, I thought about something along these lines. Perhaps if we could dynamically alter regclasses.

I think I'll give the changing reservedness approach a go here since there's already some precedent for it.

MBB.addLiveIn(Reg1);
if (RPI.isPaired()) {
if (!MRI.isReserved(Reg2))
if (!RegInfo->isStrictlyReservedReg(MF, Reg2))
MBB.addLiveIn(Reg2);
MIB.addReg(Reg2, getPrologueDeath(MF, Reg2));
MIB.addMemOperand(MF.getMachineMemOperand(
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -501,6 +501,9 @@ AArch64RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, AArch64::GPR32commonRegClass.getRegister(i));
}

if (MF.getSubtarget<AArch64Subtarget>().isLRReservedForRA())
markSuperRegs(Reserved, AArch64::LR);

assert(checkAllSuperRegsMarked(Reserved));
return Reserved;
}
Expand Down
1 change: 1 addition & 0 deletions llvm/lib/Target/AArch64/AArch64Subtarget.h
Original file line number Diff line number Diff line change
Expand Up @@ -209,6 +209,7 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
AllReservedX |= ReserveXRegisterForRA;
return AllReservedX.count();
}
bool isLRReservedForRA() const { return ReserveLRForRA; }
bool isXRegCustomCalleeSaved(size_t i) const {
return CustomCallSavedXRegs[i];
}
Expand Down
7 changes: 6 additions & 1 deletion llvm/test/CodeGen/AArch64/arm64-platform-reg.ll
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
; RUN: llc -mtriple=arm64-linux-gnu -mattr=+reserve-x26 -o - %s | FileCheck %s --check-prefixes=CHECK-RESERVE,CHECK-RESERVE-X26
; RUN: llc -mtriple=arm64-linux-gnu -mattr=+reserve-x27 -o - %s | FileCheck %s --check-prefixes=CHECK-RESERVE,CHECK-RESERVE-X27
; RUN: llc -mtriple=arm64-linux-gnu -mattr=+reserve-x28 -o - %s | FileCheck %s --check-prefixes=CHECK-RESERVE,CHECK-RESERVE-X28
; RUN: llc -mtriple=arm64-linux-gnu -mattr=+reserve-lr-for-ra -o - %s | FileCheck %s --check-prefixes=CHECK-RESERVE,CHECK-RESERVE-LR-RA

; Test multiple of reserve-x# options together.
; RUN: llc -mtriple=arm64-linux-gnu \
Expand Down Expand Up @@ -72,6 +73,7 @@
; RUN: -mattr=+reserve-x26 \
; RUN: -mattr=+reserve-x27 \
; RUN: -mattr=+reserve-x28 \
; RUN: -mattr=+reserve-lr-for-ra \
; RUN: -reserve-regs-for-regalloc=X8,X16,X17,X19 \
; RUN: -o - %s | FileCheck %s \
; RUN: --check-prefix=CHECK-RESERVE \
Expand Down Expand Up @@ -102,7 +104,8 @@
; RUN: --check-prefix=CHECK-RESERVE-X25 \
; RUN: --check-prefix=CHECK-RESERVE-X26 \
; RUN: --check-prefix=CHECK-RESERVE-X27 \
; RUN: --check-prefix=CHECK-RESERVE-X28
; RUN: --check-prefix=CHECK-RESERVE-X28 \
; RUN: --check-prefix=CHECK-RESERVE-LR-RA

; x18 is reserved as a platform register on Darwin but not on other
; systems. Create loads of register pressure and make sure this is respected.
Expand Down Expand Up @@ -149,6 +152,7 @@ define void @keep_live() {
; CHECK-RESERVE-X26-NOT: ldr x26
; CHECK-RESERVE-X27-NOT: ldr x27
; CHECK-RESERVE-X28-NOT: ldr x28
; CHECK-RESERVE-LR-RA-NOT: ldr x30
; CHECK-RESERVE: Spill
; CHECK-RESERVE-NOT: ldr fp
; CHECK-RESERVE-X1-NOT: ldr x1,
Expand Down Expand Up @@ -178,6 +182,7 @@ define void @keep_live() {
; CHECK-RESERVE-X26-NOT: ldr x26
; CHECK-RESERVE-X27-NOT: ldr x27
; CHECK-RESERVE-X28-NOT: ldr x28
; CHECK-RESERVE-LR-RA-NOT: ldr x30
; CHECK-RESERVE: ret
ret void
}
110 changes: 110 additions & 0 deletions llvm/test/CodeGen/AArch64/lr-reserved-for-ra-live-in.mir
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -run-pass=prologepilog -o - %s | FileCheck %s
--- |
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "arm64e-apple-ios18.4.0"

declare void @spam()

define i32 @check_lr_live(ptr %arg) #0 {
bb:
%icmp = icmp eq ptr %arg, null
%or = or i1 %icmp, false
br i1 %or, label %bb2, label %bb1

bb1: ; preds = %bb
call void @spam()
br label %bb2

bb2: ; preds = %bb1, %bb
%phi = phi i32 [ -536870206, %bb ], [ 0, %bb1 ]
ret i32 %phi
}

attributes #0 = { nounwind "target-features"="+fp-armv8,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+reserve-lr-for-ra" }

...
# This test checks that the live-in flags for LR are correctly set when LR is reserved, but live for a call.
---
name: check_lr_live
alignment: 4
tracksRegLiveness: true
tracksDebugUserValues: true
liveins:
- { reg: '$x0' }
frameInfo:
maxAlignment: 1
adjustsStack: true
hasCalls: true
maxCallFrameSize: 0
savePoint: '%bb.2'
restorePoint: '%bb.2'
machineFunctionInfo: {}
body: |
; CHECK-LABEL: name: check_lr_live
; CHECK: bb.0.bb:
; CHECK-NEXT: successors: %bb.1(0x20000000), %bb.2(0x60000000)
; CHECK-NEXT: liveins: $x0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: renamable $x8 = COPY $x0
; CHECK-NEXT: renamable $w0 = MOVi32imm -536870206
; CHECK-NEXT: CBNZX killed renamable $x8, %bb.2
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.1:
; CHECK-NEXT: successors: %bb.4(0x80000000)
; CHECK-NEXT: liveins: $w0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: B %bb.4
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.2.bb:
; CHECK-NEXT: successors: %bb.4(0x2aaaaaab), %bb.3(0x55555555)
; CHECK-NEXT: liveins: $w0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: CBNZW $wzr, %bb.4
; CHECK-NEXT: B %bb.3
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.3.bb1:
; CHECK-NEXT: successors: %bb.4(0x80000000)
; CHECK-NEXT: liveins: $lr
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: early-clobber $sp = frame-setup STPXpre killed $fp, killed $lr, $sp, -2 :: (store (s64) into %stack.1), (store (s64) into %stack.0)
; CHECK-NEXT: BL @spam, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
; CHECK-NEXT: renamable $w0 = COPY $wzr
; CHECK-NEXT: early-clobber $sp, $fp, $lr = frame-destroy LDPXpost $sp, 2 :: (load (s64) from %stack.1), (load (s64) from %stack.0)
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: bb.4.bb2:
; CHECK-NEXT: liveins: $w0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: RET_ReallyLR implicit $w0
bb.0.bb:
successors: %bb.4(0x20000000), %bb.1(0x60000000)
liveins: $x0

renamable $x8 = COPY $x0
renamable $w0 = MOVi32imm -536870206
CBNZX killed renamable $x8, %bb.1

bb.4:
liveins: $w0

B %bb.3

bb.1.bb:
successors: %bb.3(0x2aaaaaab), %bb.2(0x55555555)
liveins: $w0

CBNZW $wzr, %bb.3
B %bb.2

bb.2.bb1:
ADJCALLSTACKDOWN 0, 0, implicit-def dead $sp, implicit $sp
BL @spam, csr_darwin_aarch64_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
ADJCALLSTACKUP 0, 0, implicit-def dead $sp, implicit $sp
renamable $w0 = COPY $wzr

bb.3.bb2:
liveins: $w0

RET_ReallyLR implicit $w0

...
Loading