Skip to content

[AArch64] When hardening against SLS, only create called thunks #97472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 5, 2024

Conversation

atrosinenko
Copy link
Contributor

In preparation for implementing hardening of BLRA* instructions, restrict thunk function generation to only the thunks being actually called from any function. As described in the existing comments, emitting all possible thunks for BLRAA and BLRAB instructions would mean adding about 1800 functions in total, most of which are likely not to be called.

This commit merges AArch64SLSHardening class into SLSBLRThunkInserter, so thunks can be created as needed while rewriting a machine function. The usages of TII, TRI and ST fields of AArch64SLSHardening class are replaced with requesting them in-place, as ThunkInserter assumes multiple "entry points" in contrast to the only runOnMachineFunction method of AArch64SLSHardening.

The runOnMachineFunction method essentially replaces pre-existing insertThunks implementation as there is no more need to insert all possible thunks unconditionally. Instead, thunks are created on first use from inside of insertThunks method.

@llvmbot
Copy link
Member

llvmbot commented Jul 2, 2024

@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-backend-aarch64

Author: Anatoly Trosinenko (atrosinenko)

Changes

In preparation for implementing hardening of BLRA* instructions, restrict thunk function generation to only the thunks being actually called from any function. As described in the existing comments, emitting all possible thunks for BLRAA and BLRAB instructions would mean adding about 1800 functions in total, most of which are likely not to be called.

This commit merges AArch64SLSHardening class into SLSBLRThunkInserter, so thunks can be created as needed while rewriting a machine function. The usages of TII, TRI and ST fields of AArch64SLSHardening class are replaced with requesting them in-place, as ThunkInserter assumes multiple "entry points" in contrast to the only runOnMachineFunction method of AArch64SLSHardening.

The runOnMachineFunction method essentially replaces pre-existing insertThunks implementation as there is no more need to insert all possible thunks unconditionally. Instead, thunks are created on first use from inside of insertThunks method.


Full diff: https://github.com/llvm/llvm-project/pull/97472.diff

8 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64.h (-1)
  • (modified) llvm/lib/Target/AArch64/AArch64SLSHardening.cpp (+72-116)
  • (modified) llvm/lib/Target/AArch64/AArch64TargetMachine.cpp (-1)
  • (modified) llvm/test/CodeGen/AArch64/O0-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/AArch64/O3-pipeline.ll (-1)
  • (modified) llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll (-8)
  • (modified) llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr-bti.mir (-20)
  • (modified) llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir (+3-17)
diff --git a/llvm/lib/Target/AArch64/AArch64.h b/llvm/lib/Target/AArch64/AArch64.h
index 6f2aeb83a451a..66ad701d83958 100644
--- a/llvm/lib/Target/AArch64/AArch64.h
+++ b/llvm/lib/Target/AArch64/AArch64.h
@@ -40,7 +40,6 @@ FunctionPass *createAArch64ISelDag(AArch64TargetMachine &TM,
 FunctionPass *createAArch64StorePairSuppressPass();
 FunctionPass *createAArch64ExpandPseudoPass();
 FunctionPass *createAArch64SLSHardeningPass();
-FunctionPass *createAArch64IndirectThunks();
 FunctionPass *createAArch64SpeculationHardeningPass();
 FunctionPass *createAArch64LoadStoreOptimizationPass();
 ModulePass *createAArch64LowerHomogeneousPrologEpilogPass();
diff --git a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
index d35386eaeab12..b4ebd7d5377c2 100644
--- a/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
+++ b/llvm/lib/Target/AArch64/AArch64SLSHardening.cpp
@@ -13,20 +13,16 @@
 
 #include "AArch64InstrInfo.h"
 #include "AArch64Subtarget.h"
-#include "Utils/AArch64BaseInfo.h"
 #include "llvm/CodeGen/IndirectThunks.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
 #include "llvm/CodeGen/MachineFunction.h"
-#include "llvm/CodeGen/MachineFunctionPass.h"
 #include "llvm/CodeGen/MachineInstr.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineOperand.h"
-#include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/IR/DebugLoc.h"
 #include "llvm/Pass.h"
-#include "llvm/Support/CodeGen.h"
-#include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
 #include "llvm/Target/TargetMachine.h"
 #include <cassert>
 
@@ -36,38 +32,43 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
+static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+
 namespace {
 
-class AArch64SLSHardening : public MachineFunctionPass {
-public:
-  const TargetInstrInfo *TII;
-  const TargetRegisterInfo *TRI;
-  const AArch64Subtarget *ST;
+// Set of inserted thunks: bitmask with bits corresponding to
+// indexes in SLSBLRThunks array.
+typedef uint32_t ThunksSet;
 
-  static char ID;
-
-  AArch64SLSHardening() : MachineFunctionPass(ID) {
-    initializeAArch64SLSHardeningPass(*PassRegistry::getPassRegistry());
+struct SLSBLRThunkInserter : ThunkInserter<SLSBLRThunkInserter, ThunksSet> {
+public:
+  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
+  bool mayUseThunk(const MachineFunction &MF) {
+    // FIXME: ComdatThunks is only accumulated until the first thunk is created.
+    ComdatThunks &= !MF.getSubtarget<AArch64Subtarget>().hardenSlsNoComdat();
+    // We are inserting barriers aside from thunk calls, so
+    // check hardenSlsRetBr() as well.
+    return MF.getSubtarget<AArch64Subtarget>().hardenSlsBlr() ||
+           MF.getSubtarget<AArch64Subtarget>().hardenSlsRetBr();
   }
+  ThunksSet insertThunks(MachineModuleInfo &MMI, MachineFunction &MF,
+                         ThunksSet ExistingThunks);
+  void populateThunk(MachineFunction &MF);
 
-  bool runOnMachineFunction(MachineFunction &Fn) override;
+private:
+  bool ComdatThunks = true;
 
-  StringRef getPassName() const override { return AARCH64_SLS_HARDENING_NAME; }
+  bool hardenReturnsAndBRs(MachineModuleInfo &MMI, MachineBasicBlock &MBB);
+  bool hardenBLRs(MachineModuleInfo &MMI, MachineBasicBlock &MBB,
+                  ThunksSet &Thunks);
 
-private:
-  bool hardenReturnsAndBRs(MachineBasicBlock &MBB) const;
-  bool hardenBLRs(MachineBasicBlock &MBB) const;
-  MachineBasicBlock &ConvertBLRToBL(MachineBasicBlock &MBB,
-                                    MachineBasicBlock::instr_iterator) const;
+  void convertBLRToBL(MachineModuleInfo &MMI, MachineBasicBlock &MBB,
+                      MachineBasicBlock::instr_iterator MBBI,
+                      ThunksSet &Thunks);
 };
 
 } // end anonymous namespace
 
-char AArch64SLSHardening::ID = 0;
-
-INITIALIZE_PASS(AArch64SLSHardening, "aarch64-sls-hardening",
-                AARCH64_SLS_HARDENING_NAME, false, false)
-
 static void insertSpeculationBarrier(const AArch64Subtarget *ST,
                                      MachineBasicBlock &MBB,
                                      MachineBasicBlock::iterator MBBI,
@@ -90,18 +91,18 @@ static void insertSpeculationBarrier(const AArch64Subtarget *ST,
     BuildMI(MBB, MBBI, DL, TII->get(BarrierOpc));
 }
 
-bool AArch64SLSHardening::runOnMachineFunction(MachineFunction &MF) {
-  ST = &MF.getSubtarget<AArch64Subtarget>();
-  TII = MF.getSubtarget().getInstrInfo();
-  TRI = MF.getSubtarget().getRegisterInfo();
+ThunksSet SLSBLRThunkInserter::insertThunks(MachineModuleInfo &MMI,
+                                            MachineFunction &MF,
+                                            ThunksSet ExistingThunks) {
+  const AArch64Subtarget *ST = &MF.getSubtarget<AArch64Subtarget>();
 
-  bool Modified = false;
   for (auto &MBB : MF) {
-    Modified |= hardenReturnsAndBRs(MBB);
-    Modified |= hardenBLRs(MBB);
+    if (ST->hardenSlsRetBr())
+      hardenReturnsAndBRs(MMI, MBB);
+    if (ST->hardenSlsBlr())
+      hardenBLRs(MMI, MBB, ExistingThunks);
   }
-
-  return Modified;
+  return ExistingThunks;
 }
 
 static bool isBLR(const MachineInstr &MI) {
@@ -120,9 +121,10 @@ static bool isBLR(const MachineInstr &MI) {
   return false;
 }
 
-bool AArch64SLSHardening::hardenReturnsAndBRs(MachineBasicBlock &MBB) const {
-  if (!ST->hardenSlsRetBr())
-    return false;
+bool SLSBLRThunkInserter::hardenReturnsAndBRs(MachineModuleInfo &MMI,
+                                              MachineBasicBlock &MBB) {
+  const AArch64Subtarget *ST =
+      &MBB.getParent()->getSubtarget<AArch64Subtarget>();
   bool Modified = false;
   MachineBasicBlock::iterator MBBI = MBB.getFirstTerminator(), E = MBB.end();
   MachineBasicBlock::iterator NextMBBI;
@@ -138,12 +140,11 @@ bool AArch64SLSHardening::hardenReturnsAndBRs(MachineBasicBlock &MBB) const {
   return Modified;
 }
 
-static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
-
+static const unsigned NumPermittedRegs = 29;
 static const struct ThunkNameAndReg {
   const char* Name;
   Register Reg;
-} SLSBLRThunks[] = {
+} SLSBLRThunks[NumPermittedRegs] = {
   { "__llvm_slsblr_thunk_x0",  AArch64::X0},
   { "__llvm_slsblr_thunk_x1",  AArch64::X1},
   { "__llvm_slsblr_thunk_x2",  AArch64::X2},
@@ -180,34 +181,11 @@ static const struct ThunkNameAndReg {
   { "__llvm_slsblr_thunk_x31",  AArch64::XZR},
 };
 
-namespace {
-struct SLSBLRThunkInserter : ThunkInserter<SLSBLRThunkInserter> {
-  const char *getThunkPrefix() { return SLSBLRNamePrefix; }
-  bool mayUseThunk(const MachineFunction &MF) {
-    // FIXME: ComdatThunks is only accumulated until the first thunk is created.
-    ComdatThunks &= !MF.getSubtarget<AArch64Subtarget>().hardenSlsNoComdat();
-    return MF.getSubtarget<AArch64Subtarget>().hardenSlsBlr();
-  }
-  bool insertThunks(MachineModuleInfo &MMI, MachineFunction &MF,
-                    bool ExistingThunks);
-  void populateThunk(MachineFunction &MF);
-
-private:
-  bool ComdatThunks = true;
-};
-} // namespace
-
-bool SLSBLRThunkInserter::insertThunks(MachineModuleInfo &MMI,
-                                       MachineFunction &MF,
-                                       bool ExistingThunks) {
-  if (ExistingThunks)
-    return false;
-  // FIXME: It probably would be possible to filter which thunks to produce
-  // based on which registers are actually used in BLR instructions in this
-  // function. But would that be a worthwhile optimization?
-  for (auto T : SLSBLRThunks)
-    createThunkFunction(MMI, T.Name, ComdatThunks);
-  return true;
+unsigned getThunkIndex(Register Reg) {
+  for (unsigned I = 0; I < NumPermittedRegs; ++I)
+    if (SLSBLRThunks[I].Reg == Reg)
+      return I;
+  llvm_unreachable("Unexpected register");
 }
 
 void SLSBLRThunkInserter::populateThunk(MachineFunction &MF) {
@@ -257,8 +235,10 @@ void SLSBLRThunkInserter::populateThunk(MachineFunction &MF) {
                            Entry->end(), DebugLoc(), true /*AlwaysUseISBDSB*/);
 }
 
-MachineBasicBlock &AArch64SLSHardening::ConvertBLRToBL(
-    MachineBasicBlock &MBB, MachineBasicBlock::instr_iterator MBBI) const {
+void SLSBLRThunkInserter::convertBLRToBL(MachineModuleInfo &MMI,
+                                         MachineBasicBlock &MBB,
+                                         MachineBasicBlock::instr_iterator MBBI,
+                                         ThunksSet &Thunks) {
   // Transform a BLR to a BL as follows:
   // Before:
   //   |-----------------------------|
@@ -317,37 +297,16 @@ MachineBasicBlock &AArch64SLSHardening::ConvertBLRToBL(
   }
   DebugLoc DL = BLR.getDebugLoc();
 
-  // If we'd like to support also BLRAA and BLRAB instructions, we'd need
-  // a lot more different kind of thunks.
-  // For example, a
-  //
-  // BLRAA xN, xM
-  //
-  // instruction probably would need to be transformed to something like:
-  //
-  // BL __llvm_slsblraa_thunk_x<N>_x<M>
-  //
-  // __llvm_slsblraa_thunk_x<N>_x<M>:
-  //   BRAA x<N>, x<M>
-  //   barrierInsts
-  //
-  // Given that about 30 different values of N are possible and about 30
-  // different values of M are possible in the above, with the current way
-  // of producing indirect thunks, we'd be producing about 30 times 30, i.e.
-  // about 900 thunks (where most might not be actually called). This would
-  // multiply further by two to support both BLRAA and BLRAB variants of those
-  // instructions.
-  // If we'd want to support this, we'd probably need to look into a different
-  // way to produce thunk functions, based on which variants are actually
-  // needed, rather than producing all possible variants.
-  // So far, LLVM does never produce BLRA* instructions, so let's leave this
-  // for the future when LLVM can start producing BLRA* instructions.
   MachineFunction &MF = *MBBI->getMF();
   MCContext &Context = MBB.getParent()->getContext();
-  auto ThunkIt =
-      llvm::find_if(SLSBLRThunks, [Reg](auto T) { return T.Reg == Reg; });
-  assert (ThunkIt != std::end(SLSBLRThunks));
-  MCSymbol *Sym = Context.getOrCreateSymbol(ThunkIt->Name);
+  const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+  unsigned ThunkIndex = getThunkIndex(Reg);
+  StringRef ThunkName = SLSBLRThunks[ThunkIndex].Name;
+  MCSymbol *Sym = Context.getOrCreateSymbol(ThunkName);
+  if (!(Thunks & (1u << ThunkIndex))) {
+    Thunks |= 1u << ThunkIndex;
+    createThunkFunction(MMI, ThunkName, ComdatThunks);
+  }
 
   MachineInstr *BL = BuildMI(MBB, MBBI, DL, TII->get(BLOpcode)).addSym(Sym);
 
@@ -385,13 +344,11 @@ MachineBasicBlock &AArch64SLSHardening::ConvertBLRToBL(
                                            RegIsKilled /*isKill*/));
   // Remove BLR instruction
   MBB.erase(MBBI);
-
-  return MBB;
 }
 
-bool AArch64SLSHardening::hardenBLRs(MachineBasicBlock &MBB) const {
-  if (!ST->hardenSlsBlr())
-    return false;
+bool SLSBLRThunkInserter::hardenBLRs(MachineModuleInfo &MMI,
+                                     MachineBasicBlock &MBB,
+                                     ThunksSet &Thunks) {
   bool Modified = false;
   MachineBasicBlock::instr_iterator MBBI = MBB.instr_begin(),
                                     E = MBB.instr_end();
@@ -400,31 +357,30 @@ bool AArch64SLSHardening::hardenBLRs(MachineBasicBlock &MBB) const {
     MachineInstr &MI = *MBBI;
     NextMBBI = std::next(MBBI);
     if (isBLR(MI)) {
-      ConvertBLRToBL(MBB, MBBI);
+      convertBLRToBL(MMI, MBB, MBBI, Thunks);
       Modified = true;
     }
   }
   return Modified;
 }
 
-FunctionPass *llvm::createAArch64SLSHardeningPass() {
-  return new AArch64SLSHardening();
-}
-
 namespace {
-class AArch64IndirectThunks : public ThunkInserterPass<SLSBLRThunkInserter> {
+class AArch64SLSHardening : public ThunkInserterPass<SLSBLRThunkInserter> {
 public:
   static char ID;
 
-  AArch64IndirectThunks() : ThunkInserterPass(ID) {}
+  AArch64SLSHardening() : ThunkInserterPass(ID) {}
 
-  StringRef getPassName() const override { return "AArch64 Indirect Thunks"; }
+  StringRef getPassName() const override { return AARCH64_SLS_HARDENING_NAME; }
 };
 
 } // end anonymous namespace
 
-char AArch64IndirectThunks::ID = 0;
+char AArch64SLSHardening::ID = 0;
+
+INITIALIZE_PASS(AArch64SLSHardening, "aarch64-sls-hardening",
+                AARCH64_SLS_HARDENING_NAME, false, false)
 
-FunctionPass *llvm::createAArch64IndirectThunks() {
-  return new AArch64IndirectThunks();
+FunctionPass *llvm::createAArch64SLSHardeningPass() {
+  return new AArch64SLSHardening();
 }
diff --git a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
index 37ce07d4a09de..bcd677310d124 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
@@ -861,7 +861,6 @@ void AArch64PassConfig::addPreEmitPass() {
 }
 
 void AArch64PassConfig::addPostBBSections() {
-  addPass(createAArch64IndirectThunks());
   addPass(createAArch64SLSHardeningPass());
   addPass(createAArch64PointerAuthPass());
   if (EnableBranchTargets)
diff --git a/llvm/test/CodeGen/AArch64/O0-pipeline.ll b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
index a0306b8e1e924..78a7b84b8479b 100644
--- a/llvm/test/CodeGen/AArch64/O0-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O0-pipeline.ll
@@ -74,7 +74,6 @@
 ; CHECK-NEXT:       StackMap Liveness Analysis
 ; CHECK-NEXT:       Live DEBUG_VALUE analysis
 ; CHECK-NEXT:       Machine Sanitizer Binary Metadata
-; CHECK-NEXT:       AArch64 Indirect Thunks
 ; CHECK-NEXT:       AArch64 sls hardening pass
 ; CHECK-NEXT:       AArch64 Pointer Authentication
 ; CHECK-NEXT:       AArch64 Branch Targets
diff --git a/llvm/test/CodeGen/AArch64/O3-pipeline.ll b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
index 84e672d14d99d..c5d604a5a2783 100644
--- a/llvm/test/CodeGen/AArch64/O3-pipeline.ll
+++ b/llvm/test/CodeGen/AArch64/O3-pipeline.ll
@@ -227,7 +227,6 @@
 ; CHECK-NEXT:       Machine Sanitizer Binary Metadata
 ; CHECK-NEXT:     Machine Outliner
 ; CHECK-NEXT:     FunctionPass Manager
-; CHECK-NEXT:       AArch64 Indirect Thunks
 ; CHECK-NEXT:       AArch64 sls hardening pass
 ; CHECK-NEXT:       AArch64 Pointer Authentication
 ; CHECK-NEXT:       AArch64 Branch Targets
diff --git a/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll b/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll
index 3ffaf962425b3..08c314e538734 100644
--- a/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-opt-remarks-lazy-bfi.ll
@@ -34,10 +34,6 @@
 ; HOTNESS-NEXT:  Executing Pass 'Function Pass Manager'
 ; HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
 ; HOTNESS-NEXT:  Freeing Pass 'Verify generated machine code' on Function 'empty_func'...
-; HOTNESS-NEXT: Executing Pass 'AArch64 Indirect Thunks' on Function 'empty_func'...
-; HOTNESS-NEXT:  Freeing Pass 'AArch64 Indirect Thunks' on Function 'empty_func'...
-; HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
-; HOTNESS-NEXT:  Freeing Pass 'Verify generated machine code' on Function 'empty_func'...
 ; HOTNESS-NEXT: Executing Pass 'AArch64 sls hardening pass' on Function 'empty_func'...
 ; HOTNESS-NEXT:  Freeing Pass 'AArch64 sls hardening pass' on Function 'empty_func'...
 ; HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
@@ -83,10 +79,6 @@
 ; NO_HOTNESS-NEXT:  Executing Pass 'Function Pass Manager'
 ; NO_HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
 ; NO_HOTNESS-NEXT:  Freeing Pass 'Verify generated machine code' on Function 'empty_func'...
-; NO_HOTNESS-NEXT: Executing Pass 'AArch64 Indirect Thunks' on Function 'empty_func'...
-; NO_HOTNESS-NEXT:  Freeing Pass 'AArch64 Indirect Thunks' on Function 'empty_func'...
-; NO_HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
-; NO_HOTNESS-NEXT:  Freeing Pass 'Verify generated machine code' on Function 'empty_func'...
 ; NO_HOTNESS-NEXT: Executing Pass 'AArch64 sls hardening pass' on Function 'empty_func'...
 ; NO_HOTNESS-NEXT:  Freeing Pass 'AArch64 sls hardening pass' on Function 'empty_func'...
 ; NO_HOTNESS-NEXT: Executing Pass 'Verify generated machine code' on Function 'empty_func'...
diff --git a/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr-bti.mir b/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr-bti.mir
index 92353b648943a..46e42bbe6bf82 100644
--- a/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr-bti.mir
+++ b/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr-bti.mir
@@ -8,8 +8,6 @@
 # These BLR/BTI bundles are produced when calling a returns_twice function
 # (like setjmp) indirectly.
 --- |
-  $__llvm_slsblr_thunk_x8 = comdat any
-
   define dso_local void @fn() #0 {
   entry:
     %fnptr = alloca ptr, align 8
@@ -22,15 +20,8 @@
   ; Function Attrs: returns_twice
   declare i32 @setjmp(ptr noundef) #1
 
-  ; Function Attrs: naked nounwind
-  define linkonce_odr hidden void @__llvm_slsblr_thunk_x8() #2 comdat {
-  entry:
-    ret void
-  }
-
   attributes #0 = { "target-features"="+harden-sls-blr" }
   attributes #1 = { returns_twice }
-  attributes #2 = { naked nounwind }
 
   !llvm.module.flags = !{!0}
   !0 = !{i32 8, !"branch-target-enforcement", i32 1}
@@ -75,14 +66,3 @@ body:             |
     early-clobber $sp, $lr = frame-destroy LDRXpost $sp, 16 :: (load (s64) from %stack.1)
     RET undef $lr
 ...
----
-name:            __llvm_slsblr_thunk_x8
-tracksRegLiveness: true
-body:             |
-  bb.0.entry:
-    liveins: $x8
-
-    $x16 = ORRXrs $xzr, $x8, 0
-    BR $x16
-    SpeculationBarrierISBDSBEndBB
-...
diff --git a/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir b/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir
index 81f95348f511e..e5e90582af098 100644
--- a/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir
+++ b/llvm/test/CodeGen/AArch64/speculation-hardening-sls-blr.mir
@@ -1,12 +1,12 @@
 # RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu \
 # RUN:     -start-before aarch64-sls-hardening \
 # RUN:     -stop-after aarch64-sls-hardening -o - %s \
-# RUN:   | FileCheck %s --check-prefixes=CHECK
+# RUN:   | FileCheck %s --check-prefixes=CHECK \
+# RUN:                  --implicit-check-not=__llvm_slsblr_thunk_x7
 
 # Check that the BLR SLS hardening transforms a BLR into a BL with operands as
 # expected.
 --- |
-  $__llvm_slsblr_thunk_x8 = comdat any
   @a = dso_local local_unnamed_addr global i32 (...)* null, align 8
   @b = dso_local local_unnamed_addr global i32 0, align 4
 
@@ -17,12 +17,6 @@
     store i32 %call, i32* @b, align 4
     ret void
   }
-
-  ; Function Attrs: naked nounwind
-  define linkonce_odr hidden void @__llvm_slsblr_thunk_x8() naked nounwind comdat {
-  entry:
-    ret void
-  }
 ...
 ---
 name:            fn1
@@ -46,13 +40,5 @@ body:             |
 
 
 ...
----
-name:            __llvm_slsblr_thunk_x8
-tracksRegLiveness: true
-body:             |
-  bb.0.entry:
-    liveins: $x8
 
-    BR $x8
-    SpeculationBarrierISBDSBEndBB
-...
+# CHECK: name: __llvm_slsblr_thunk_x8


BR $x8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important to check that the body of the thunk inserted is as expected.
What is the motivation to remove that check, and only check that a thunk with the expected name was inserted, without checking its body?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above lines are input to llc, so I removed them to not conflict with the thunks that are inserted by the pass. Unfortunately, MIR syntax is quite verbose, so I added a few basic CHECK lines testing MIR output and duplicated the RUN lines to generate and check assembly output as well.

Now there is no separation between function-rewriting and thunk-inserting passes, and I assumed it is safe to expect that initially there are no thunks in the module. I guess the original version of thunk insertion logic would assert on pre-existing thunks as well (due to unknown thunk names with .1 suffix).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, I was confused and didn't spot these were input lines....
I looked a bit more now at all of the existing SLS-hardening tests, and the content of the thunk is already tested in

; HARDEN-label: __llvm_slsblr_thunk_x0:

So, there isn't really a need to test for it separately here.
Apologies for giving you more work here!

I think it's fine for the test to go in as is; and it's equally fine for the test to go in without the CHECK lines for the content of the thunk. I'll leave it to you to make that decision.


AArch64SLSHardening() : MachineFunctionPass(ID) {
initializeAArch64SLSHardeningPass(*PassRegistry::getPassRegistry());
struct SLSBLRThunkInserter : ThunkInserter<SLSBLRThunkInserter, ThunksSet> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if SLSBLRThunkInserter is the most appropriate name for this class.
IIUC, this class will also harden Returns and BRs, (see method hardenReturnsAndBRs), which does not insert thunks.
If I understand this correctly, I think sticking with the original name of "AArch64SLSHardening" describes better what this class does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it to SLSHardeningInserter, so the AArch64SLSHardening name is kept for the pass itself.

Copy link
Contributor Author

@atrosinenko atrosinenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbeyls thank you for the comments!


BR $x8
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above lines are input to llc, so I removed them to not conflict with the thunks that are inserted by the pass. Unfortunately, MIR syntax is quite verbose, so I added a few basic CHECK lines testing MIR output and duplicated the RUN lines to generate and check assembly output as well.

Now there is no separation between function-rewriting and thunk-inserting passes, and I assumed it is safe to expect that initially there are no thunks in the module. I guess the original version of thunk insertion logic would assert on pre-existing thunks as well (due to unknown thunk names with .1 suffix).


AArch64SLSHardening() : MachineFunctionPass(ID) {
initializeAArch64SLSHardeningPass(*PassRegistry::getPassRegistry());
struct SLSBLRThunkInserter : ThunkInserter<SLSBLRThunkInserter, ThunksSet> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed it to SLSHardeningInserter, so the AArch64SLSHardening name is kept for the pass itself.

Base automatically changed from users/atrosinenko/refactor-thunk-inserter to main July 4, 2024 14:03
@atrosinenko atrosinenko force-pushed the users/atrosinenko/aarch64-sls-hardening branch from 0503385 to 29d6b44 Compare July 4, 2024 14:23
@atrosinenko atrosinenko force-pushed the users/atrosinenko/aarch64-sls-hardening branch from 29d6b44 to 0247af9 Compare July 4, 2024 14:26
Copy link

github-actions bot commented Jul 4, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! (after fixing the clang-format failure)


BR $x8
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, yes, I was confused and didn't spot these were input lines....
I looked a bit more now at all of the existing SLS-hardening tests, and the content of the thunk is already tested in

; HARDEN-label: __llvm_slsblr_thunk_x0:

So, there isn't really a need to test for it separately here.
Apologies for giving you more work here!

I think it's fine for the test to go in as is; and it's equally fine for the test to go in without the CHECK lines for the content of the thunk. I'll leave it to you to make that decision.

In preparation for implementing hardening of BLRA* instructions,
restrict thunk function generation to only the thunks being actually
called from any function. As described in the existing comments,
emitting all possible thunks for BLRAA and BLRAB instructions would
mean adding about 1800 functions in total, most of which are likely
not to be called.

This commit merges AArch64SLSHardening class into SLSBLRThunkInserter,
so thunks can be created as needed while rewriting a machine function.
The usages of TII, TRI and ST fields of AArch64SLSHardening class are
replaced with requesting them in-place, as ThunkInserter assumes
multiple "entry points" in contrast to the only runOnMachineFunction
method of AArch64SLSHardening.

The runOnMachineFunction method essentially replaces pre-existing
insertThunks implementation as there is no more need to insert all
possible thunks unconditionally. Instead, thunks are created on first
use from inside of insertThunks method.
@atrosinenko atrosinenko force-pushed the users/atrosinenko/aarch64-sls-hardening branch from 0247af9 to 28a8541 Compare July 4, 2024 16:22
@atrosinenko atrosinenko merged commit cc53b95 into main Jul 5, 2024
7 checks passed
@atrosinenko atrosinenko deleted the users/atrosinenko/aarch64-sls-hardening branch July 5, 2024 10:12
@atrosinenko atrosinenko changed the title [AArch64] Only create called thunks when hardening against SLS [AArch64] When hardening against SLS, only create called thunks Jul 5, 2024
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
…#97472)

In preparation for implementing hardening of BLRA* instructions,
restrict thunk function generation to only the thunks being actually
called from any function. As described in the existing comments,
emitting all possible thunks for BLRAA and BLRAB instructions would mean
adding about 1800 functions in total, most of which are likely not to be
called.

This commit merges AArch64SLSHardening class into SLSBLRThunkInserter,
so thunks can be created as needed while rewriting a machine function.
The usages of TII, TRI and ST fields of AArch64SLSHardening class are
replaced with requesting them in-place, as ThunkInserter assumes
multiple "entry points" in contrast to the only runOnMachineFunction
method of AArch64SLSHardening.

The runOnMachineFunction method essentially replaces pre-existing
insertThunks implementation as there is no more need to insert all
possible thunks unconditionally. Instead, thunks are created on first
use from inside of insertThunks method.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants