Skip to content

[BOLT] Gadget scanner: detect signing oracles #134146

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 20, 2025

Conversation

atrosinenko
Copy link
Contributor

Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

resign:
  autda x0, x1
  pacda x0, x2
  ret

is probably safe provided autda generates an error on authentication
failure, this function

sign_anything:
  pacda x0, x1
  ret

is inherently unsafe.

Copy link
Contributor Author

atrosinenko commented Apr 2, 2025

@llvmbot
Copy link
Member

llvmbot commented Apr 2, 2025

@llvm/pr-subscribers-bolt

Author: Anatoly Trosinenko (atrosinenko)

Changes

Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

resign:
  autda x0, x1
  pacda x0, x2
  ret

is probably safe provided autda generates an error on authentication
failure, this function

sign_anything:
  pacda x0, x1
  ret

is inherently unsafe.


Patch is 107.43 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/134146.diff

6 Files Affected:

  • (modified) bolt/include/bolt/Core/MCPlusBuilder.h (+40)
  • (modified) bolt/lib/Passes/PAuthGadgetScanner.cpp (+144-7)
  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+207)
  • (added) bolt/test/binary-analysis/AArch64/gs-pauth-address-checks.s (+637)
  • (modified) bolt/test/binary-analysis/AArch64/gs-pauth-debug-output.s (+71-68)
  • (added) bolt/test/binary-analysis/AArch64/gs-pauth-signing-oracles.s (+993)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 0749919cfc570..c205030635803 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -49,6 +49,7 @@ class MCSymbol;
 class raw_ostream;
 
 namespace bolt {
+class BinaryBasicBlock;
 class BinaryFunction;
 
 /// Different types of indirect branches encountered during disassembly.
@@ -572,6 +573,11 @@ class MCPlusBuilder {
     return false;
   }
 
+  virtual MCPhysReg getSignedReg(const MCInst &Inst) const {
+    llvm_unreachable("not implemented");
+    return getNoRegister();
+  }
+
   virtual ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const {
     llvm_unreachable("not implemented");
     return getNoRegister();
@@ -615,6 +621,40 @@ class MCPlusBuilder {
     return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional<std::pair<MCPhysReg, MCInst *>>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const {
+    llvm_unreachable("not implemented");
+    return std::nullopt;
+  }
+
+  /// Returns the register that is checked to be authenticated successfully.
+  ///
+  /// If the returned register was safe-to-dereference before execution of Inst,
+  /// it becomes trusted afterward (if MayOverwrite is false) or at least does
+  /// not escape in a way usable as an authentication oracle (if MayOverwrite
+  /// is true).
+  virtual MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+                                      bool MayOverwrite) const {
+    llvm_unreachable("not implemented");
+    return getNoRegister();
+  }
+
   virtual bool isTerminator(const MCInst &Inst) const;
 
   virtual bool isNoop(const MCInst &Inst) const {
diff --git a/bolt/lib/Passes/PAuthGadgetScanner.cpp b/bolt/lib/Passes/PAuthGadgetScanner.cpp
index a768521fb13dc..2077f046b2c74 100644
--- a/bolt/lib/Passes/PAuthGadgetScanner.cpp
+++ b/bolt/lib/Passes/PAuthGadgetScanner.cpp
@@ -194,10 +194,24 @@ template <typename T> static void iterateOverInstrs(BinaryFunction &BF, T Fn) {
 ///   X30 is safe-to-dereference - the state computed for sub- and
 ///   super-registers is not inspected.
 struct State {
-  /// A BitVector containing the registers that are either safe at function
-  /// entry and were not clobbered yet, or those not clobbered since being
-  /// authenticated.
+  /// A BitVector containing the registers that are either authenticated
+  /// (assuming failed authentication is permitted to produce an invalid
+  /// address, provided it generates an error on memory access) or whose
+  /// value is known not to be attacker-controlled under Pointer Authentication
+  /// threat model. The registers in this set are either
+  /// * not clobbered since being authenticated, or
+  /// * trusted at function entry and were not clobbered yet, or
+  /// * contain a safely materialized address.
   BitVector SafeToDerefRegs;
+  /// A BitVector containing the registers that are either authenticated
+  /// *successfully* or whose value is known not to be attacker-controlled
+  /// under Pointer Authentication threat model.
+  /// The registers in this set are either
+  /// * authenticated and then checked to be authenticated successfully
+  ///   (and not clobbered since then), or
+  /// * trusted at function entry and were not clobbered yet, or
+  /// * contain a safely materialized address.
+  BitVector TrustedRegs;
   /// A vector of sets, only used in the second data flow run.
   /// Each element in the vector represents one of the registers for which we
   /// track the set of last instructions that wrote to this register. For
@@ -210,7 +224,8 @@ struct State {
   State() {}
 
   State(unsigned NumRegs, unsigned NumRegsToTrack)
-      : SafeToDerefRegs(NumRegs), LastInstWritingReg(NumRegsToTrack) {}
+      : SafeToDerefRegs(NumRegs), TrustedRegs(NumRegs),
+        LastInstWritingReg(NumRegsToTrack) {}
 
   State &merge(const State &StateIn) {
     if (StateIn.empty())
@@ -219,6 +234,7 @@ struct State {
       return (*this = StateIn);
 
     SafeToDerefRegs &= StateIn.SafeToDerefRegs;
+    TrustedRegs &= StateIn.TrustedRegs;
     for (unsigned I = 0; I < LastInstWritingReg.size(); ++I)
       for (const MCInst *J : StateIn.LastInstWritingReg[I])
         LastInstWritingReg[I].insert(J);
@@ -231,6 +247,7 @@ struct State {
 
   bool operator==(const State &RHS) const {
     return SafeToDerefRegs == RHS.SafeToDerefRegs &&
+           TrustedRegs == RHS.TrustedRegs &&
            LastInstWritingReg == RHS.LastInstWritingReg;
   }
   bool operator!=(const State &RHS) const { return !((*this) == RHS); }
@@ -255,6 +272,7 @@ raw_ostream &operator<<(raw_ostream &OS, const State &S) {
     OS << "empty";
   } else {
     OS << "SafeToDerefRegs: " << S.SafeToDerefRegs << ", ";
+    OS << "TrustedRegs: " << S.TrustedRegs << ", ";
     printLastInsts(OS, S.LastInstWritingReg);
   }
   OS << ">";
@@ -275,11 +293,14 @@ void PacStatePrinter::print(raw_ostream &OS, const State &S) const {
   OS << "pacret-state<";
   if (S.empty()) {
     assert(S.SafeToDerefRegs.empty());
+    assert(S.TrustedRegs.empty());
     assert(S.LastInstWritingReg.empty());
     OS << "empty";
   } else {
     OS << "SafeToDerefRegs: ";
     RegStatePrinter.print(OS, S.SafeToDerefRegs);
+    OS << ", TrustedRegs: ";
+    RegStatePrinter.print(OS, S.TrustedRegs);
     OS << ", ";
     printLastInsts(OS, S.LastInstWritingReg);
   }
@@ -308,6 +329,17 @@ class PacRetAnalysis {
   /// RegToTrackInstsFor is the set of registers for which the dataflow analysis
   /// must compute which the last set of instructions writing to it are.
   const TrackedRegisters RegsToTrackInstsFor;
+  /// Stores information about the detected instruction sequences emitted to
+  /// check an authenticated pointer. Specifically, if such sequence is detected
+  /// in a basic block, it maps the last instruction of that basic block to
+  /// (CheckedRegister, FirstInstOfTheSequence) pair, see the description of
+  /// MCPlusBuilder::getAuthCheckedReg(BB) method.
+  ///
+  /// As the detection of such sequences requires iterating over the adjacent
+  /// instructions, it should be done before calling computeNext(), which
+  /// operates on separate instructions.
+  DenseMap<const MCInst *, std::pair<MCPhysReg, const MCInst *>>
+      CheckerSequenceInfo;
 
   SmallPtrSet<const MCInst *, 4> &lastWritingInsts(State &S,
                                                    MCPhysReg Reg) const {
@@ -322,8 +354,10 @@ class PacRetAnalysis {
 
   State createEntryState() {
     State S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-    for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-      S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+    for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+      S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+      S.SafeToDerefRegs = S.TrustedRegs;
+    }
     return S;
   }
 
@@ -370,6 +404,45 @@ class PacRetAnalysis {
     return Regs;
   }
 
+  // Returns all registers made trusted by this instruction.
+  SmallVector<MCPhysReg> getRegsMadeTrusted(const MCInst &Point,
+                                            const State &Cur) const {
+    SmallVector<MCPhysReg> Regs;
+    const MCPhysReg NoReg = BC.MIB->getNoRegister();
+
+    // An authenticated pointer can be checked, or
+    MCPhysReg CheckedReg =
+        BC.MIB->getAuthCheckedReg(Point, /*MayOverwrite=*/false);
+    if (CheckedReg != NoReg && Cur.SafeToDerefRegs[CheckedReg])
+      Regs.push_back(CheckedReg);
+
+    if (CheckerSequenceInfo.contains(&Point)) {
+      MCPhysReg CheckedReg;
+      const MCInst *FirstCheckerInst;
+      std::tie(CheckedReg, FirstCheckerInst) = CheckerSequenceInfo.at(&Point);
+
+      // FirstCheckerInst should belong to the same basic block, meaning
+      // it was deterministically processed a few steps before this instruction.
+      const State &StateBeforeChecker = getStateBefore(*FirstCheckerInst).get();
+      if (StateBeforeChecker.SafeToDerefRegs[CheckedReg])
+        Regs.push_back(CheckedReg);
+    }
+
+    // ... a safe address can be materialized, or
+    MCPhysReg NewAddrReg = BC.MIB->getMaterializedAddressRegForPtrAuth(Point);
+    if (NewAddrReg != NoReg)
+      Regs.push_back(NewAddrReg);
+
+    // ... an address can be updated in a safe manner, producing the result
+    // which is as trusted as the input address.
+    if (auto DstAndSrc = BC.MIB->analyzeAddressArithmeticsForPtrAuth(Point)) {
+      if (Cur.TrustedRegs[DstAndSrc->second])
+        Regs.push_back(DstAndSrc->first);
+    }
+
+    return Regs;
+  }
+
   State computeNext(const MCInst &Point, const State &Cur) {
     PacStatePrinter P(BC);
     LLVM_DEBUG({
@@ -396,11 +469,34 @@ class PacRetAnalysis {
     BitVector Clobbered = getClobberedRegs(Point);
     SmallVector<MCPhysReg> NewSafeToDerefRegs =
         getRegsMadeSafeToDeref(Point, Cur);
+    SmallVector<MCPhysReg> NewTrustedRegs = getRegsMadeTrusted(Point, Cur);
+
+    // Ideally, being trusted is a strictly stronger property than being
+    // safe-to-dereference. To simplify the computation of Next state, enforce
+    // this for NewSafeToDerefRegs and NewTrustedRegs. Additionally, this
+    // fixes the properly for "cumulative" register states in tricky cases
+    // like the following:
+    //
+    //    ; LR is safe to dereference here
+    //    mov   x16, x30  ; start of the sequence, LR is s-t-d right before
+    //    xpaclri         ; clobbers LR, LR is not safe anymore
+    //    cmp   x30, x16
+    //    b.eq  1f        ; end of the sequence: LR is marked as trusted
+    //    brk   0x1234
+    //  1:
+    //    ; at this point LR would be marked as trusted,
+    //    ; but not safe-to-dereference
+    //
+    for (auto TrustedReg : NewTrustedRegs) {
+      if (!is_contained(NewSafeToDerefRegs, TrustedReg))
+        NewSafeToDerefRegs.push_back(TrustedReg);
+    }
 
     // Then, compute the state after this instruction is executed.
     State Next = Cur;
 
     Next.SafeToDerefRegs.reset(Clobbered);
+    Next.TrustedRegs.reset(Clobbered);
     // Keep track of this instruction if it writes to any of the registers we
     // need to track that for:
     for (MCPhysReg Reg : RegsToTrackInstsFor.getRegisters())
@@ -421,6 +517,10 @@ class PacRetAnalysis {
         lastWritingInsts(Next, Reg).clear();
     }
 
+    // Process new trusted registers.
+    for (MCPhysReg TrustedReg : NewTrustedRegs)
+      Next.TrustedRegs |= BC.MIB->getAliases(TrustedReg, /*OnlySmaller=*/true);
+
     LLVM_DEBUG({
       dbgs() << "  .. result: (";
       P.print(dbgs(), Next);
@@ -476,7 +576,22 @@ class PacRetDFAnalysis
     return DFParent::getStateBefore(Inst);
   }
 
-  void run() override { DFParent::run(); }
+  void run() override {
+    for (BinaryBasicBlock &BB : Func) {
+      if (auto CheckerInfo = BC.MIB->getAuthCheckedReg(BB)) {
+        MCInst *LastInstOfChecker = BB.getLastNonPseudoInstr();
+        LLVM_DEBUG({
+          dbgs() << "Found pointer checking sequence in " << BB.getName()
+                 << ":\n";
+          traceReg(BC, "Checked register", CheckerInfo->first);
+          traceInst(BC, "First instruction", *CheckerInfo->second);
+          traceInst(BC, "Last instruction", *LastInstOfChecker);
+        });
+        CheckerSequenceInfo[LastInstOfChecker] = *CheckerInfo;
+      }
+    }
+    DFParent::run();
+  }
 
 protected:
   void preflight() {}
@@ -634,6 +749,26 @@ shouldReportCallGadget(const BinaryContext &BC, const MCInstReference &Inst,
   return std::make_shared<GadgetReport>(CallKind, Inst, DestReg);
 }
 
+static std::shared_ptr<Report>
+shouldReportSigningOracle(const BinaryContext &BC, const MCInstReference &Inst,
+                          const State &S) {
+  static const GadgetKind SigningOracleKind("signing oracle found");
+
+  MCPhysReg SignedReg = BC.MIB->getSignedReg(Inst);
+  if (SignedReg == BC.MIB->getNoRegister())
+    return nullptr;
+
+  LLVM_DEBUG({
+    traceInst(BC, "Found sign inst", Inst);
+    traceReg(BC, "Signed reg", SignedReg);
+    traceRegMask(BC, "TrustedRegs", S.TrustedRegs);
+  });
+  if (S.TrustedRegs[SignedReg])
+    return nullptr;
+
+  return std::make_shared<GadgetReport>(SigningOracleKind, Inst, SignedReg);
+}
+
 FunctionAnalysisResult
 Analysis::findGadgets(BinaryFunction &BF,
                       MCPlusBuilder::AllocatorIdTy AllocatorId) {
@@ -666,6 +801,8 @@ Analysis::findGadgets(BinaryFunction &BF,
 
     if (auto Report = shouldReportCallGadget(BC, Inst, S))
       Result.Diagnostics.push_back(Report);
+    if (auto Report = shouldReportSigningOracle(BC, Inst, S))
+      Result.Diagnostics.push_back(Report);
   });
   return Result;
 }
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 30ad628345dbf..7609e64a44ca0 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -256,6 +256,36 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     return AuthenticatedReg.getError() ? false : *AuthenticatedReg == Reg;
   }
 
+  MCPhysReg getSignedReg(const MCInst &Inst) const override {
+    switch (Inst.getOpcode()) {
+    case AArch64::PACIA:
+    case AArch64::PACIB:
+    case AArch64::PACDA:
+    case AArch64::PACDB:
+    case AArch64::PACIZA:
+    case AArch64::PACIZB:
+    case AArch64::PACDZA:
+    case AArch64::PACDZB:
+      return Inst.getOperand(0).getReg();
+    case AArch64::PACIAZ:
+    case AArch64::PACIBZ:
+    case AArch64::PACIASP:
+    case AArch64::PACIBSP:
+    case AArch64::PACIASPPC:
+    case AArch64::PACIBSPPC:
+    case AArch64::PACNBIASPPC:
+    case AArch64::PACNBIBSPPC:
+      return AArch64::LR;
+    case AArch64::PACIA1716:
+    case AArch64::PACIB1716:
+    case AArch64::PACIA171615:
+    case AArch64::PACIB171615:
+      return AArch64::X17;
+    default:
+      return getNoRegister();
+    }
+  }
+
   ErrorOr<MCPhysReg> getRegUsedAsRetDest(const MCInst &Inst) const override {
     assert(isReturn(Inst));
     switch (Inst.getOpcode()) {
@@ -341,6 +371,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
     }
   }
 
+  std::optional<std::pair<MCPhysReg, MCInst *>>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+    // Match several possible hard-coded sequences of instructions which can be
+    // emitted by LLVM backend to check that the authenticated pointer is
+    // correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+    //
+    // This function only matches sequences involving branch instructions.
+    // All these sequences have the form:
+    //
+    // (0) ... regular code that authenticates a pointer in Xn ...
+    // (1) analyze Xn
+    // (2) branch to .Lon_success if the pointer is correct
+    // (3) BRK #imm (fall-through basic block)
+    //
+    // In the above pseudocode, (1) + (2) is one of the following sequences:
+    //
+    // - eor Xtmp, Xn, Xn, lsl #1
+    //   tbz Xtmp, #62, .Lon_success
+    //
+    // - mov Xtmp, Xn
+    //   xpac(i|d) Xn (or xpaclri if Xn is LR)
+    //   cmp Xtmp, Xn
+    //   b.eq .Lon_success
+    //
+    // Note that any branch destination operand is accepted as .Lon_success -
+    // it is the responsibility of the caller of getAuthCheckedReg to inspect
+    // the list of successors of this basic block as appropriate.
+
+    // Any of the above code sequences assume the fall-through basic block
+    // is a dead-end BRK instruction (any immediate operand is accepted).
+    const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+    if (!BreakBB || BreakBB->empty() ||
+        BreakBB->front().getOpcode() != AArch64::BRK)
+      return std::nullopt;
+
+    // Iterate over the instructions of BB in reverse order, matching opcodes
+    // and operands.
+    MCPhysReg TestedReg = 0;
+    MCPhysReg ScratchReg = 0;
+    auto It = BB.end();
+    auto StepAndGetOpcode = [&It, &BB]() -> int {
+      if (It == BB.begin())
+        return -1;
+      --It;
+      return It->getOpcode();
+    };
+
+    switch (StepAndGetOpcode()) {
+    default:
+      // Not matched the branch instruction.
+      return std::nullopt;
+    case AArch64::Bcc:
+      // Bcc EQ, .Lon_success
+      if (It->getOperand(0).getImm() != AArch64CC::EQ)
+        return std::nullopt;
+      // Not checking .Lon_success (see above).
+
+      // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+      if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+          It->getOperand(0).getReg() != AArch64::XZR ||
+          It->getOperand(3).getImm() != 0)
+        return std::nullopt;
+      TestedReg = It->getOperand(1).getReg();
+      ScratchReg = It->getOperand(2).getReg();
+
+      // Either XPAC(I|D) ScratchReg, ScratchReg
+      // or     XPACLRI
+      switch (StepAndGetOpcode()) {
+      default:
+        return std::nullopt;
+      case AArch64::XPACLRI:
+        // No operands to check, but using XPACLRI forces TestedReg to be X30.
+        if (TestedReg != AArch64::LR)
+          return std::nullopt;
+        break;
+      case AArch64::XPACI:
+      case AArch64::XPACD:
+        if (It->getOperand(0).getReg() != ScratchReg ||
+            It->getOperand(1).getReg() != ScratchReg)
+          return std::nullopt;
+        break;
+      }
+
+      // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+      if (StepAndGetOpcode() != AArch64::ORRXrs)
+        return std::nullopt;
+      if (It->getOperand(0).getReg() != ScratchReg ||
+          It->getOperand(1).getReg() != AArch64::XZR ||
+          It->getOperand(2).getReg() != TestedReg ||
+          It->getOperand(3).getImm() != 0)
+        return std::nullopt;
+
+      return std::make_pair(TestedReg, &*It);
+
+    case AArch64::TBZX:
+      // TBZX ScratchReg, 62, .Lon_success
+      ScratchReg = It->getOperand(0).getReg();
+      if (It->getOperand(1).getImm() != 62)
+        return std::nullopt;
+      // Not checking .Lon_success (see above).
+
+      // EORXrs ScratchReg, TestedReg, TestedReg, 1
+      if (StepAndGetOpcode() != AArch64::EORXrs)
+        return std::nullopt;
+      TestedReg = It->getOperand(1).getReg();
+      if (It->getOperand(0).getReg() != ScratchReg ||
+          It->getOperand(2).getReg() != TestedReg ||
+          It->getOperand(3).getImm() != 1)
+        return std::nullopt;
+
+      return std::make_pair(TestedReg, &*It);
+    }
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+                              bool MayOverwrite) const override {
+    // Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+    // method as it accepts an instance of MachineInstr, not MCInst.
+    const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+    // If signing oracles are considered, the particular value left in the base
+    // register after this instruction is important. This function checks that
+    // if the base register was overwritten, it is due to address write-back.
+    //
+    // Note that this function is not needed for authentication oracles, as the
+    // particular value left in the register after a successful memory access
+    // is not important.
+    auto ClobbersBaseRegExceptWriteback = [&](unsigned BaseRegUseIndex) {
+      MCPhysReg BaseReg = Inst.getOperand(BaseRegUseIndex).getReg();
+      unsigned WrittenBackDefIndex = Desc.getOperandConstraint(
+          BaseRegUseIndex, MCOI::OperandCons...
[truncated]

@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-nocfg-analysis branch from 49981c9 to de917fb Compare April 3, 2025 13:43
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 826e9d7 to 5a05ce4 Compare April 3, 2025 13:43
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-nocfg-analysis branch from de917fb to cf3e913 Compare April 3, 2025 15:01
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 5a05ce4 to 89af763 Compare April 3, 2025 15:01
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-nocfg-analysis branch from cf3e913 to f2f0522 Compare April 4, 2025 16:27
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 89af763 to 957df88 Compare April 4, 2025 16:27
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-nocfg-analysis branch from f2f0522 to b345663 Compare April 7, 2025 10:15
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 957df88 to 96b8b4d Compare April 7, 2025 10:15
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 56ea6bc to 57ca35a Compare April 22, 2025 16:08
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-nocfg-analysis branch from 168488e to 2295e97 Compare April 22, 2025 16:08
Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for only reviewing piece-meal. I'm struggling a bit at the time to reserve longer blocks of time to review this fully in one go.
I hope my comments still make sense though....

Copy link
Contributor Author

@atrosinenko atrosinenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbeyls Thank you for the comments! I have updated code comments accordingly.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally managed to read through this patch end-to-end.
I only have 3 very nit-picky questions left.
This looks almost ready to merge.

Copy link
Collaborator

@kbeyls kbeyls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience with my many questions!
This looks good to merge to me now.

Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

    resign:
      autda x0, x1
      pacda x0, x2
      ret

is probably safe provided `autda` generates an error on authentication
failure, this function

    sign_anything:
      pacda x0, x1
      ret

is inherently unsafe.
@atrosinenko atrosinenko force-pushed the users/atrosinenko/bolt-gs-signing-oracles branch from 66db728 to 19698f7 Compare May 20, 2025 10:03
@atrosinenko atrosinenko merged commit 48a2836 into main May 20, 2025
10 checks passed
@atrosinenko atrosinenko deleted the users/atrosinenko/bolt-gs-signing-oracles branch May 20, 2025 10:42
@llvm-ci
Copy link
Collaborator

llvm-ci commented May 20, 2025

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building bolt at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/32298

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
...
PASS: LLVM :: TableGen/GlobalISelCombinerEmitter/match-table-variadics.td (6179 of 99678)
PASS: Clang :: Index/complete-block-property-assignment.m (6180 of 99678)
PASS: Clang :: CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vfdiv.c (6181 of 99678)
PASS: Clang Tools :: clang-tidy/checkers/bugprone/chained-comparison.cpp (6182 of 99678)
PASS: Clang :: OpenMP/declare_variant_device_isa_codegen_1.c (6183 of 99678)
PASS: LLVM :: DebugInfo/X86/prototyped.ll (6184 of 99678)
PASS: Clang :: SemaHLSL/Language/TemplateOutArg.hlsl (6185 of 99678)
PASS: Clang :: Driver/mips-img-v2.cpp (6186 of 99678)
PASS: Clang Tools :: clang-tidy/checkers/readability/redundant-access-specifiers.cpp (6187 of 99678)
TIMEOUT: SanitizerCommon-tsan-x86_64-Linux :: Posix/fork_threaded.c (6188 of 99678)
******************** TEST 'SanitizerCommon-tsan-x86_64-Linux :: Posix/fork_threaded.c' FAILED ********************
Exit Code: -9
Timeout: Reached timeout of 60 seconds

Command Output (stderr):
--
/build/buildbot/premerge-monolithic-linux/build/./bin/clang  -gline-tables-only -fsanitize=thread  -m64 -funwind-tables  -I/build/buildbot/premerge-monolithic-linux/llvm-project/compiler-rt/test -ldl -O0 /build/buildbot/premerge-monolithic-linux/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Posix/fork_threaded.c -o /build/buildbot/premerge-monolithic-linux/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Posix/Output/fork_threaded.c.tmp && env TSAN_OPTIONS=die_after_fork=0  /build/buildbot/premerge-monolithic-linux/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Posix/Output/fork_threaded.c.tmp # RUN: at line 1
+ /build/buildbot/premerge-monolithic-linux/build/./bin/clang -gline-tables-only -fsanitize=thread -m64 -funwind-tables -I/build/buildbot/premerge-monolithic-linux/llvm-project/compiler-rt/test -ldl -O0 /build/buildbot/premerge-monolithic-linux/llvm-project/compiler-rt/test/sanitizer_common/TestCases/Posix/fork_threaded.c -o /build/buildbot/premerge-monolithic-linux/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Posix/Output/fork_threaded.c.tmp
+ env TSAN_OPTIONS=die_after_fork=0 /build/buildbot/premerge-monolithic-linux/build/runtimes/runtimes-bins/compiler-rt/test/sanitizer_common/tsan-x86_64-Linux/Posix/Output/fork_threaded.c.tmp

--

********************
PASS: Clang :: OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp (6189 of 99678)
PASS: MemorySanitizer-X86_64 :: Linux/pthread_getaffinity_np.cpp (6190 of 99678)
PASS: LLVM :: CodeGen/RISCV/rvv/vsseg-rv64.ll (6191 of 99678)
PASS: Clang :: CodeGen/lto-newpm-pipeline.c (6192 of 99678)
PASS: UBSan-ThreadSanitizer-lld-x86_64 :: TestCases/Misc/nonnull-arg.cpp (6193 of 99678)
PASS: UBSan-AddressSanitizer-x86_64 :: TestCases/TypeCheck/vptr-non-unique-typeinfo.cpp (6194 of 99678)
PASS: Clang Tools :: clang-tidy/checkers/bugprone/tagged-union-member-count.cpp (6195 of 99678)
PASS: UBSan-ThreadSanitizer-x86_64 :: TestCases/TypeCheck/vptr-non-unique-typeinfo.cpp (6196 of 99678)
PASS: LLVM :: CodeGen/AMDGPU/llvm.amdgcn.image.a16.encode.ll (6197 of 99678)
PASS: LLVM :: TableGen/RegisterEncoder.td (6198 of 99678)
PASS: MLIR :: Dialect/Transform/test-interpreter.mlir (6199 of 99678)
PASS: Clang :: CodeGen/RISCV/rvv-intrinsics-autogenerated/policy/overloaded/vloxseg2ei8.c (6200 of 99678)
PASS: LLVM :: CodeGen/AMDGPU/fnearbyint.ll (6201 of 99678)
PASS: NSan-X86_64 :: new_delete_test.cpp (6202 of 99678)
PASS: ThreadSanitizer-x86_64 :: java_alloc.cpp (6203 of 99678)
PASS: LLVM :: tools/llvm-exegesis/RISCV/rvv/valid-sew.test (6204 of 99678)
PASS: Clang :: CodeGen/arm-mve-intrinsics/vcvt.c (6205 of 99678)
PASS: Clang :: Driver/armv8.1m.main.s (6206 of 99678)
PASS: Clang Tools :: clang-tidy/checkers/bugprone/multiple-statement-macro.cpp (6207 of 99678)
PASS: AddressSanitizer-x86_64-linux-dynamic :: TestCases/initialization-bug.cpp (6208 of 99678)
PASS: lld :: ELF/version-script.s (6209 of 99678)
PASS: AddressSanitizer-Unit :: ./Asan-x86_64-inline-Noinst-Test/11/16 (6210 of 99678)
PASS: cfi-devirt-lld-x86_64 :: cross-dso/target_out_of_bounds.cpp (6211 of 99678)
PASS: Flang :: Driver/color-diagnostics-forwarding.f90 (6212 of 99678)
PASS: lld :: COFF/arm64x-loadconfig.s (6213 of 99678)
PASS: LLVM :: TableGen/VarLenEncoderHwModes.td (6214 of 99678)

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

    resign:
      autda x0, x1
      pacda x0, x2
      ret

is probably safe provided `autda` generates an error on authentication
failure, this function

    sign_anything:
      pacda x0, x1
      ret

is inherently unsafe.
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 6, 2025
Implement the detection of signing oracles. In this patch, a signing
oracle is defined as a sign instruction that accepts a "non-protected"
pointer, but for a slightly different definition of "non-protected"
compared to control flow instructions.

A second BitVector named TrustedRegs is added to the register state
computed by the data-flow analysis. The difference between a
"safe-to-dereference" and a "trusted" register states is that to make
an unsafe register trusted by authentication, one has to make sure
that the authentication succeeded. For example, on AArch64 without
FEAT_PAuth2 and FEAT_EPAC, an authentication instruction produces an
invalid pointer on failure, so that subsequent memory access triggers
an error, but re-signing such pointer would "fix" the signature.

Note that while a separate "trusted" register state may be redundant
depending on the specific semantics of auth and sign operations, it is
still important to check signing operations: while code like this

    resign:
      autda x0, x1
      pacda x0, x2
      ret

is probably safe provided `autda` generates an error on authentication
failure, this function

    sign_anything:
      pacda x0, x1
      ret

is inherently unsafe.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants