
[AArch64] Add check that prologue insertion doesn't clobber live regs. #71826


Merged

merged 3 commits into llvm:main from aarch64-prologue-insert-clobber-check on Nov 22, 2023

Conversation

fhahn
Contributor

@fhahn fhahn commented Nov 9, 2023

This patch extends AArch64FrameLowering::emitPrologue to check whether the inserted prologue clobbers live registers.

It updates llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir with an extra load to make x9 live before the store, preserving the original test.

It adds the original llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir as llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir, because there x9 is marked as live on entry but is used as a scratch register, since it is not callee-saved.

The new assertion catches a miscompile in store-swift-async-context-clobber-live-reg.ll on https://github.com/apple/llvm-project/tree/next
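For readers skimming the conversation, here is a condensed sketch of the new verification, assembled from the diff further down. `MBB` and `TRI` are the basic block and `TargetRegisterInfo` already in scope in `AArch64FrameLowering::emitPrologue`; the actual patch additionally drops SP, FP, LR, and X19 from the live set before asserting, since those are legitimately rewritten by stack management code:

```cpp
#ifndef NDEBUG
  // Find the first instruction that is not part of the pre-existing frame
  // setup; everything the prologue emitter inserts will land before it.
  MachineBasicBlock::iterator NonFrameStart = MBB.begin();
  while (NonFrameStart != MBB.end() &&
         NonFrameStart->getFlag(MachineInstr::FrameSetup))
    ++NonFrameStart;

  LivePhysRegs LiveRegs(*TRI);
  if (NonFrameStart != MBB.end()) {
    // Compute the registers live just before NonFrameStart by stepping
    // backwards from the block's live-outs.
    LiveRegs.addLiveOuts(MBB);
    for (const MachineInstr &MI :
         reverse(make_range(NonFrameStart->getIterator(), MBB.instr_end())))
      LiveRegs.stepBackward(MI);
  }

  // ... the prologue is emitted here ...

  // No newly inserted instruction may define a register that was live at
  // NonFrameStart.
  if (NonFrameStart != MBB.end())
    for (MachineInstr &MI :
         make_range(MBB.instr_begin(), NonFrameStart->getIterator()))
      for (const MachineOperand &Op : MI.operands())
        if (Op.isReg() && Op.isDef())
          assert(!LiveRegs.contains(Op.getReg()) &&
                 "live register clobbered by inserted prologue instructions");
#endif
```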


github-actions bot commented Nov 9, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

This patch extends AArch64FrameLowering::emitPrologue to check if the
inserted prologue clobbers live registers.

At the moment, llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir
is failing because it has a block with the following live-in list:
liveins: $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $lr

meaning the prologue actually clobbers a live register.
@fhahn fhahn force-pushed the aarch64-prologue-insert-clobber-check branch from 4321e9f to da84646 on November 17, 2023 13:28
@fhahn fhahn marked this pull request as ready for review November 17, 2023 13:34
@llvmbot
Member

llvmbot commented Nov 17, 2023

@llvm/pr-subscribers-backend-aarch64

Author: Florian Hahn (fhahn)

Full diff: https://github.com/llvm/llvm-project/pull/71826.diff

5 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+46-1)
  • (added) llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir (+23)
  • (modified) llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir (+1)
  • (modified) llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir (+1)
  • (added) llvm/test/CodeGen/AArch64/store-swift-async-context-clobber-live-reg.ll (+70)
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index b036f7582323700..99690c3cea7b2a0 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -1423,6 +1423,18 @@ static void emitDefineCFAWithFP(MachineFunction &MF, MachineBasicBlock &MBB,
       .setMIFlags(MachineInstr::FrameSetup);
 }
 
+/// Collect live registers from the end of \p MI's parent up to (and
+/// including) \p MI in \p LiveRegs.
+static void getLivePhysRegsUpTo(MachineInstr &MI, const TargetRegisterInfo &TRI,
+                                LivePhysRegs &LiveRegs) {
+
+  MachineBasicBlock &MBB = *MI.getParent();
+  LiveRegs.addLiveOuts(MBB);
+  for (const MachineInstr &MI :
+       reverse(make_range(MI.getIterator(), MBB.instr_end())))
+    LiveRegs.stepBackward(MI);
+}
+
 void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
                                         MachineBasicBlock &MBB) const {
   MachineBasicBlock::iterator MBBI = MBB.begin();
@@ -1431,6 +1443,8 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
   const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
   const AArch64RegisterInfo *RegInfo = Subtarget.getRegisterInfo();
   const TargetInstrInfo *TII = Subtarget.getInstrInfo();
+  const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+
   MachineModuleInfo &MMI = MF.getMMI();
   AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
   bool EmitCFI = AFI->needsDwarfUnwindInfo(MF);
@@ -1440,6 +1454,25 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
   bool HasWinCFI = false;
   auto Cleanup = make_scope_exit([&]() { MF.setHasWinCFI(HasWinCFI); });
 
+  MachineBasicBlock::iterator End = MBB.end();
+#ifndef NDEBUG
+  // Collect live registers from the end of MBB up to the start of the existing
+  // frame setup instructions.
+  MachineBasicBlock::iterator NonFrameStart = MBB.begin();
+  while (NonFrameStart != End &&
+         NonFrameStart->getFlag(MachineInstr::FrameSetup))
+    ++NonFrameStart;
+  LivePhysRegs LiveRegs(*TRI);
+  if (NonFrameStart != MBB.end()) {
+    getLivePhysRegsUpTo(*NonFrameStart, *TRI, LiveRegs);
+    // Ignore registers used for stack management for now.
+    LiveRegs.removeReg(AArch64::SP);
+    LiveRegs.removeReg(AArch64::X19);
+    LiveRegs.removeReg(AArch64::FP);
+    LiveRegs.removeReg(AArch64::LR);
+  }
+#endif
+
   bool IsFunclet = MBB.isEHFuncletEntry();
 
   // At this point, we're going to decide whether or not the function uses a
@@ -1608,7 +1641,6 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
   // Move past the saves of the callee-saved registers, fixing up the offsets
   // and pre-inc if we decided to combine the callee-save and local stack
   // pointer bump above.
-  MachineBasicBlock::iterator End = MBB.end();
   while (MBBI != End && MBBI->getFlag(MachineInstr::FrameSetup) &&
          !IsSVECalleeSave(MBBI)) {
     if (CombineSPBump)
@@ -1908,6 +1940,19 @@ void AArch64FrameLowering::emitPrologue(MachineFunction &MF,
     emitCalleeSavedGPRLocations(MBB, MBBI);
     emitCalleeSavedSVELocations(MBB, MBBI);
   }
+
+#ifndef NDEBUG
+  if (NonFrameStart != MBB.end()) {
+    // Check if any of the newly inserted instructions clobber any of the live registers.
+    for (MachineInstr &MI :
+         make_range(MBB.instr_begin(), NonFrameStart->getIterator())) {
+      for (auto &Op : MI.operands())
+        if (Op.isReg() && Op.isDef())
+          assert(!LiveRegs.contains(Op.getReg()) &&
+                 "live register clobbered by inserted prologue instructions");
+    }
+  }
+#endif
 }
 
 static bool isFuncletReturnInstr(const MachineInstr &MI) {
diff --git a/llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir b/llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir
new file mode 100644
index 000000000000000..0e71347b9991288
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir
@@ -0,0 +1,23 @@
+# RUN: not llc -mtriple=aarch64-none-linux-gnu -run-pass=prologepilog %s -o -
+#
+# REQUIRES: asserts
+#
+---
+# x9 is marked as live on function entry, but it will be used as scratch
+# register for prologue computations at the beginning of the prologue.
+# Use this to check we catch that the prologue clobbers $x9.
+name: x9_clobbered_on_fn_entry
+frameInfo:
+  isFrameAddressTaken: true
+stack:
+  - { id: 0, size:    16, alignment: 16 }
+  - { id: 1, size: 32768, alignment: 32 }
+body: |
+  bb.0:
+    liveins: $x0, $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $lr
+    STRXui $x0, %stack.0, 0
+    B %bb.1
+  bb.1:
+    liveins: $x1, $x2, $x3, $x4, $x5, $x6, $x7, $x8, $x9, $x10, $x11, $x12, $x13, $x14, $x15, $x16, $x17, $x18, $x19, $x20, $x21, $x22, $x23, $x24, $x25, $x26, $x27, $x28, $lr
+    RET_ReallyLR implicit $lr
+...
diff --git a/llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir b/llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir
index 53fe9f0e61e4575..390582969d0264c 100644
--- a/llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir
+++ b/llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir
@@ -19,6 +19,7 @@ stack:
 body: |
   bb.0:
     liveins: $x0, $x8
+    $x9 = LDRXui $x0, 0 :: (load (s64))
     STRXui $x0, %stack.0, 0
     B %bb.1
   bb.1:
diff --git a/llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir b/llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir
index 7d7b3ace8a915cd..3dba21d59b4087e 100644
--- a/llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir
+++ b/llvm/test/CodeGen/AArch64/framelayout-sve-calleesaves-fix.mir
@@ -30,6 +30,7 @@
   ; CHECK-NEXT:    ret
 ...
 name: fix_restorepoint_p4
+tracksRegLiveness: true
 stack:
   - { id: 0, stack-id: scalable-vector, size: 16, alignment: 16 }
 body:             |
diff --git a/llvm/test/CodeGen/AArch64/store-swift-async-context-clobber-live-reg.ll b/llvm/test/CodeGen/AArch64/store-swift-async-context-clobber-live-reg.ll
new file mode 100644
index 000000000000000..832e2a6eae74fcb
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/store-swift-async-context-clobber-live-reg.ll
@@ -0,0 +1,70 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -o - -mtriple=arm64e-apple-macosx %s | FileCheck %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+define swifttailcc void @test_async_with_jumptable(ptr %src, ptr swiftasync %as) #0 {
+; CHECK-LABEL: test_async_with_jumptable:
+; CHECK:       ; %bb.0: ; %entry
+; CHECK-NEXT:    orr x29, x29, #0x1000000000000000
+; CHECK-NEXT:    str x19, [sp, #-32]! ; 8-byte Folded Spill
+; CHECK-NEXT:    stp x29, x30, [sp, #16] ; 16-byte Folded Spill
+; CHECK-NEXT:    add x16, sp, #8
+; CHECK-NEXT:    movk x16, #49946, lsl #48
+; CHECK-NEXT:    mov x17, x22
+; CHECK-NEXT:    pacdb x17, x16
+; CHECK-NEXT:    str x17, [sp, #8]
+; CHECK-NEXT:    add x29, sp, #16
+; CHECK-NEXT:    .cfi_def_cfa w29, 16
+; CHECK-NEXT:    .cfi_offset w30, -8
+; CHECK-NEXT:    .cfi_offset w29, -16
+; CHECK-NEXT:    .cfi_offset w19, -32
+; CHECK-NEXT:    mov x20, x22
+; CHECK-NEXT:    ldr x8, [x0]
+; CHECK-NEXT:    cmp x8, #2
+; CHECK-NEXT:    csel x9, xzr, x0, eq
+; CHECK-NEXT:    cmp x8, #0
+; CHECK-NEXT:    csel x10, x22, xzr, eq
+; CHECK-NEXT:    cmp x8, #1
+; CHECK-NEXT:    csel x19, x9, x10, gt
+; CHECK-NEXT:    mov x22, x0
+; CHECK-NEXT:    bl _foo
+; CHECK-NEXT:    mov x2, x0
+; CHECK-NEXT:    mov x0, x19
+; CHECK-NEXT:    mov x1, x20
+; CHECK-NEXT:    ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
+; CHECK-NEXT:    ldr x19, [sp], #32 ; 8-byte Folded Reload
+; CHECK-NEXT:    and x29, x29, #0xefffffffffffffff
+; CHECK-NEXT:    br x2
+entry:
+  %l = load i64, ptr %src, align 8
+  switch i64 %l, label %dead [
+    i64 0, label %exit
+    i64 1, label %then.1
+    i64 2, label %then.2
+    i64 3, label %then.3
+  ]
+
+then.1:
+  br label %exit
+
+then.2:
+  br label %exit
+
+then.3:
+  br label %exit
+
+dead:                                                ; preds = %entryresume.5
+  unreachable
+
+exit:
+  %p = phi ptr [ %src, %then.3 ], [ null, %then.2 ], [ %as, %entry ], [ null, %then.1 ]
+  %r = call i64 @foo()
+  %fn = inttoptr i64 %r to ptr
+  musttail call swifttailcc void %fn(ptr swiftasync %src, ptr %p, ptr %as)
+  ret void
+}
+
+declare i64 @foo()
+
+attributes #0 = { "frame-pointer"="non-leaf" }

@fhahn fhahn merged commit a842430 into llvm:main Nov 22, 2023
@fhahn fhahn deleted the aarch64-prologue-insert-clobber-check branch November 22, 2023 16:49
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 28, 2023
[AArch64] Add check that prologue insertion doesn't clobber live regs. (llvm#71826)

This patch extends AArch64FrameLowering::emitPrologue to check if the
inserted prologue clobbers live registers.

It updates `llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir`
with an extra load to make x9 live before the store, preserving the
original test.

It uses the original
`llvm/test/CodeGen/AArch64/framelayout-scavengingslot.mir` as
`llvm/test/CodeGen/AArch64/emit-prologue-clobber-verification.mir`,
because there x9 is marked as live on entry, but used as scratch reg as
it is not callee saved.

The new assertion catches a miscompile in
`store-swift-async-context-clobber-live-reg.ll` on
https://github.com/apple/llvm-project/tree/next
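
For reference, the new `emit-prologue-clobber-verification.mir` test drives the assertion directly through the prologue/epilogue insertion pass: its RUN line is `not llc -mtriple=aarch64-none-linux-gnu -run-pass=prologepilog %s -o -`, i.e. llc is expected to fail, which is why the test carries `REQUIRES: asserts`.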