[CodeGen] Inline stack guard check on Windows #136290
Conversation
@llvm/pr-subscribers-debuginfo @llvm/pr-subscribers-backend-aarch64

Author: Omair Javaid (omjavaid)

Changes

This patch optimizes the Windows security cookie check mechanism by moving the comparison inline and only calling __security_check_cookie when the check fails. This reduces the overhead of making a DLL call for every function return. Previously, we implemented this optimization through a dedicated pass (X86WinFixupBufferSecurityCheckPass) in PR #95904, but this new solution appears more generic and efficient, so we have reverted that pass in favor of this new approach.

X86WinFixupBufferSecurityCheckPass is a machine-level pass that:

- scans the generated code to find __security_check_cookie calls
- modifies these calls by splitting basic blocks
- adds the comparison logic and conditional branching
- requires complex block management and live-register computation

This new approach is simpler and more maintainable:

- it implements the same optimization during instruction selection
- it directly emits the comparison and conditional branching
- it needs no post-processing or basic-block manipulation
- it disables the optimization at -Oz
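To make the intended codegen concrete, below is a minimal standalone C++ model of the check. The names here are illustrative stand-ins, not the CRT's actual symbols (the real global is `__security_cookie` and the real handler is `__security_check_cookie`):

```cpp
#include <cstdint>
#include <cstdlib>

// Illustrative stand-in for the __security_cookie global set up by the CRT.
static const uint64_t security_cookie = 0x2B992DDFA232ULL;

// Illustrative stand-in for the failure path; real codegen calls
// __security_check_cookie in the CRT DLL and then traps (int3/brk).
[[noreturn]] static void report_cookie_failure(uint64_t bad_cookie) {
  (void)bad_cookie;
  std::abort();
}

// Conceptual shape of a protected function's return path after this patch.
void epilogue_check(uint64_t stack_slot_cookie) {
  // Old codegen: unconditionally `call __security_check_cookie` on every
  // return, paying a DLL call even in the common success case.
  // New codegen: compare inline; the call happens only when the cookie in
  // the stack slot no longer matches the global.
  if (stack_slot_cookie != security_cookie)
    report_cookie_failure(stack_slot_cookie);
  // ...normal return path...
}
```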
We will abandon the AArch64-specific implementation of the AArch64WinFixupBufferSecurityCheck pass in PR #121938 in favor of this more general solution.

Note: This solution is inspired by #121938 (comment) by @tamaspetz.

Patch is 34.53 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136290.diff

11 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
index 8cae34d06c8ba..f2265c7f68e12 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
@@ -3059,33 +3059,6 @@ void SelectionDAGBuilder::visitSPDescriptorParent(StackProtectorDescriptor &SPD,
if (TLI.useStackGuardXorFP())
GuardVal = TLI.emitStackGuardXorFP(DAG, GuardVal, dl);
- // Retrieve guard check function, nullptr if instrumentation is inlined.
- if (const Function *GuardCheckFn = TLI.getSSPStackGuardCheck(M)) {
- // The target provides a guard check function to validate the guard value.
- // Generate a call to that function with the content of the guard slot as
- // argument.
- FunctionType *FnTy = GuardCheckFn->getFunctionType();
- assert(FnTy->getNumParams() == 1 && "Invalid function signature");
-
- TargetLowering::ArgListTy Args;
- TargetLowering::ArgListEntry Entry;
- Entry.Node = GuardVal;
- Entry.Ty = FnTy->getParamType(0);
- if (GuardCheckFn->hasParamAttribute(0, Attribute::AttrKind::InReg))
- Entry.IsInReg = true;
- Args.push_back(Entry);
-
- TargetLowering::CallLoweringInfo CLI(DAG);
- CLI.setDebugLoc(getCurSDLoc())
- .setChain(DAG.getEntryNode())
- .setCallee(GuardCheckFn->getCallingConv(), FnTy->getReturnType(),
- getValue(GuardCheckFn), std::move(Args));
-
- std::pair<SDValue, SDValue> Result = TLI.LowerCallTo(CLI);
- DAG.setRoot(Result.second);
- return;
- }
-
// If useLoadStackGuardNode returns true, generate LOAD_STACK_GUARD.
// Otherwise, emit a volatile load to retrieve the stack guard value.
SDValue Chain = DAG.getEntryNode();
@@ -3126,14 +3099,66 @@ void SelectionDAGBuilder::visitSPDescriptorParent(StackProtectorDescriptor &SPD,
/// For a high level explanation of how this fits into the stack protector
/// generation see the comment on the declaration of class
/// StackProtectorDescriptor.
-void
-SelectionDAGBuilder::visitSPDescriptorFailure(StackProtectorDescriptor &SPD) {
+void SelectionDAGBuilder::visitSPDescriptorFailure(
+ StackProtectorDescriptor &SPD) {
+
const TargetLowering &TLI = DAG.getTargetLoweringInfo();
- TargetLowering::MakeLibCallOptions CallOptions;
- CallOptions.setDiscardResult(true);
- SDValue Chain = TLI.makeLibCall(DAG, RTLIB::STACKPROTECTOR_CHECK_FAIL,
- MVT::isVoid, {}, CallOptions, getCurSDLoc())
- .second;
+ MachineBasicBlock *ParentBB = SPD.getParentMBB();
+ const Module &M = *ParentBB->getParent()->getFunction().getParent();
+ SDValue Chain;
+
+ // Retrieve guard check function, nullptr if instrumentation is inlined.
+ if (const Function *GuardCheckFn = TLI.getSSPStackGuardCheck(M)) {
+
+ // First create the loads to the guard/stack slot for the comparison.
+ EVT PtrTy = TLI.getPointerTy(DAG.getDataLayout());
+ EVT PtrMemTy = TLI.getPointerMemTy(DAG.getDataLayout());
+
+ MachineFrameInfo &MFI = ParentBB->getParent()->getFrameInfo();
+ int FI = MFI.getStackProtectorIndex();
+
+ SDLoc dl = getCurSDLoc();
+ SDValue StackSlotPtr = DAG.getFrameIndex(FI, PtrTy);
+ Align Align = DAG.getDataLayout().getPrefTypeAlign(
+ PointerType::get(M.getContext(), 0));
+
+ // Generate code to load the content of the guard slot.
+ SDValue GuardVal = DAG.getLoad(
+ PtrMemTy, dl, DAG.getEntryNode(), StackSlotPtr,
+ MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI), Align,
+ MachineMemOperand::MOVolatile);
+
+ if (TLI.useStackGuardXorFP())
+ GuardVal = TLI.emitStackGuardXorFP(DAG, GuardVal, dl);
+
+ // The target provides a guard check function to validate the guard value.
+ // Generate a call to that function with the content of the guard slot as
+ // argument.
+ FunctionType *FnTy = GuardCheckFn->getFunctionType();
+ assert(FnTy->getNumParams() == 1 && "Invalid function signature");
+
+ TargetLowering::ArgListTy Args;
+ TargetLowering::ArgListEntry Entry;
+ Entry.Node = GuardVal;
+ Entry.Ty = FnTy->getParamType(0);
+ if (GuardCheckFn->hasParamAttribute(0, Attribute::AttrKind::InReg))
+ Entry.IsInReg = true;
+ Args.push_back(Entry);
+
+ TargetLowering::CallLoweringInfo CLI(DAG);
+ CLI.setDebugLoc(getCurSDLoc())
+ .setChain(DAG.getEntryNode())
+ .setCallee(GuardCheckFn->getCallingConv(), FnTy->getReturnType(),
+ getValue(GuardCheckFn), std::move(Args));
+
+ Chain = TLI.LowerCallTo(CLI).second;
+ } else {
+ TargetLowering::MakeLibCallOptions CallOptions;
+ CallOptions.setDiscardResult(true);
+ Chain = TLI.makeLibCall(DAG, RTLIB::STACKPROTECTOR_CHECK_FAIL, MVT::isVoid,
+ {}, CallOptions, getCurSDLoc())
+ .second;
+ }
// Emit a trap instruction if we are required to do so.
const TargetOptions &TargetOpts = DAG.getTarget().Options;
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 598de6b207754..2a75dd7d8419b 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -1881,12 +1881,9 @@ void SelectionDAGISel::SelectAllBasicBlocks(const Function &Fn) {
FastIS->recomputeInsertPt();
}
- if (SP->shouldEmitSDCheck(*LLVMBB)) {
- bool FunctionBasedInstrumentation =
- TLI->getSSPStackGuardCheck(*Fn.getParent());
+ if (SP->shouldEmitSDCheck(*LLVMBB))
SDB->SPDescriptor.initialize(LLVMBB, FuncInfo->getMBB(LLVMBB),
- FunctionBasedInstrumentation);
- }
+ false);
if (Begin != BI)
++NumDAGBlocks;
@@ -1948,24 +1945,7 @@ SelectionDAGISel::FinishBasicBlock() {
PHI.addReg(FuncInfo->PHINodesToUpdate[i].second).addMBB(FuncInfo->MBB);
}
- // Handle stack protector.
- if (SDB->SPDescriptor.shouldEmitFunctionBasedCheckStackProtector()) {
- // The target provides a guard check function. There is no need to
- // generate error handling code or to split current basic block.
- MachineBasicBlock *ParentMBB = SDB->SPDescriptor.getParentMBB();
-
- // Add load and check to the basicblock.
- FuncInfo->MBB = ParentMBB;
- FuncInfo->InsertPt =
- findSplitPointForStackProtector(ParentMBB, *TII);
- SDB->visitSPDescriptorParent(SDB->SPDescriptor, ParentMBB);
- CurDAG->setRoot(SDB->getRoot());
- SDB->clear();
- CodeGenAndEmitDAG();
-
- // Clear the Per-BB State.
- SDB->SPDescriptor.resetPerBBState();
- } else if (SDB->SPDescriptor.shouldEmitStackProtector()) {
+ if (SDB->SPDescriptor.shouldEmitStackProtector()) {
MachineBasicBlock *ParentMBB = SDB->SPDescriptor.getParentMBB();
MachineBasicBlock *SuccessMBB = SDB->SPDescriptor.getSuccessMBB();
@@ -1979,8 +1959,7 @@ SelectionDAGISel::FinishBasicBlock() {
findSplitPointForStackProtector(ParentMBB, *TII);
// Splice the terminator of ParentMBB into SuccessMBB.
- SuccessMBB->splice(SuccessMBB->end(), ParentMBB,
- SplitPoint,
+ SuccessMBB->splice(SuccessMBB->end(), ParentMBB, SplitPoint,
ParentMBB->end());
// Add compare/jump on neq/jump to the parent BB.
diff --git a/llvm/lib/Target/X86/CMakeLists.txt b/llvm/lib/Target/X86/CMakeLists.txt
index 9553a8619feb5..44a54c8ec62cb 100644
--- a/llvm/lib/Target/X86/CMakeLists.txt
+++ b/llvm/lib/Target/X86/CMakeLists.txt
@@ -83,7 +83,6 @@ set(sources
X86TargetTransformInfo.cpp
X86VZeroUpper.cpp
X86WinEHState.cpp
- X86WinFixupBufferSecurityCheck.cpp
X86InsertWait.cpp
GISel/X86CallLowering.cpp
GISel/X86InstructionSelector.cpp
diff --git a/llvm/lib/Target/X86/X86.h b/llvm/lib/Target/X86/X86.h
index 48a3fe1934a96..98faae9f14dbc 100644
--- a/llvm/lib/Target/X86/X86.h
+++ b/llvm/lib/Target/X86/X86.h
@@ -73,9 +73,6 @@ FunctionPass *createX86OptimizeLEAs();
/// Return a pass that transforms setcc + movzx pairs into xor + setcc.
FunctionPass *createX86FixupSetCC();
-/// Return a pass that transform inline buffer security check into seperate bb
-FunctionPass *createX86WinFixupBufferSecurityCheckPass();
-
/// Return a pass that avoids creating store forward block issues in the hardware.
FunctionPass *createX86AvoidStoreForwardingBlocks();
@@ -190,7 +187,6 @@ void initializeX86ExpandPseudoPass(PassRegistry &);
void initializeX86FastPreTileConfigPass(PassRegistry &);
void initializeX86FastTileConfigPass(PassRegistry &);
void initializeX86FixupSetCCPassPass(PassRegistry &);
-void initializeX86WinFixupBufferSecurityCheckPassPass(PassRegistry &);
void initializeX86FlagsCopyLoweringPassPass(PassRegistry &);
void initializeX86LoadValueInjectionLoadHardeningPassPass(PassRegistry &);
void initializeX86LoadValueInjectionRetHardeningPassPass(PassRegistry &);
diff --git a/llvm/lib/Target/X86/X86TargetMachine.cpp b/llvm/lib/Target/X86/X86TargetMachine.cpp
index 4cecbbf27aa30..c5e76b1c38d26 100644
--- a/llvm/lib/Target/X86/X86TargetMachine.cpp
+++ b/llvm/lib/Target/X86/X86TargetMachine.cpp
@@ -552,7 +552,6 @@ bool X86PassConfig::addPreISel() {
void X86PassConfig::addPreRegAlloc() {
if (getOptLevel() != CodeGenOptLevel::None) {
addPass(&LiveRangeShrinkID);
- addPass(createX86WinFixupBufferSecurityCheckPass());
addPass(createX86FixupSetCC());
addPass(createX86OptimizeLEAs());
addPass(createX86CallFrameOptimization());
diff --git a/llvm/lib/Target/X86/X86WinFixupBufferSecurityCheck.cpp b/llvm/lib/Target/X86/X86WinFixupBufferSecurityCheck.cpp
deleted file mode 100644
index 5c12af1fee637..0000000000000
--- a/llvm/lib/Target/X86/X86WinFixupBufferSecurityCheck.cpp
+++ /dev/null
@@ -1,245 +0,0 @@
-//===- X86WinFixupBufferSecurityCheck.cpp Fix Buffer Security Check Call -===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-// Buffer Security Check implementation inserts windows specific callback into
-// code. On windows, __security_check_cookie call gets call everytime function
-// is return without fixup. Since this function is defined in runtime library,
-// it incures cost of call in dll which simply does comparison and returns most
-// time. With Fixup, We selective move to call in DLL only if comparison fails.
-//===----------------------------------------------------------------------===//
-
-#include "X86.h"
-#include "X86FrameLowering.h"
-#include "X86InstrInfo.h"
-#include "X86Subtarget.h"
-#include "llvm/CodeGen/LivePhysRegs.h"
-#include "llvm/CodeGen/MachineFunctionPass.h"
-#include "llvm/CodeGen/MachineInstrBuilder.h"
-#include "llvm/IR/Module.h"
-
-using namespace llvm;
-
-#define DEBUG_TYPE "x86-win-fixup-bscheck"
-
-namespace {
-
-class X86WinFixupBufferSecurityCheckPass : public MachineFunctionPass {
-public:
- static char ID;
-
- X86WinFixupBufferSecurityCheckPass() : MachineFunctionPass(ID) {}
-
- StringRef getPassName() const override {
- return "X86 Windows Fixup Buffer Security Check";
- }
-
- bool runOnMachineFunction(MachineFunction &MF) override;
-
- std::pair<MachineBasicBlock *, MachineInstr *>
- getSecurityCheckerBasicBlock(MachineFunction &MF);
-
- void getGuardCheckSequence(MachineBasicBlock *CurMBB, MachineInstr *CheckCall,
- MachineInstr *SeqMI[5]);
-
- void SplitBasicBlock(MachineBasicBlock *CurMBB, MachineBasicBlock *NewRetMBB,
- MachineBasicBlock::iterator SplitIt);
-
- void FinishBlock(MachineBasicBlock *MBB);
-
- void FinishFunction(MachineBasicBlock *FailMBB, MachineBasicBlock *NewRetMBB);
-
- std::pair<MachineInstr *, MachineInstr *>
- CreateFailCheckSequence(MachineBasicBlock *CurMBB, MachineBasicBlock *FailMBB,
- MachineInstr *SeqMI[5]);
-};
-} // end anonymous namespace
-
-char X86WinFixupBufferSecurityCheckPass::ID = 0;
-
-INITIALIZE_PASS(X86WinFixupBufferSecurityCheckPass, DEBUG_TYPE, DEBUG_TYPE,
- false, false)
-
-FunctionPass *llvm::createX86WinFixupBufferSecurityCheckPass() {
- return new X86WinFixupBufferSecurityCheckPass();
-}
-
-void X86WinFixupBufferSecurityCheckPass::SplitBasicBlock(
- MachineBasicBlock *CurMBB, MachineBasicBlock *NewRetMBB,
- MachineBasicBlock::iterator SplitIt) {
- NewRetMBB->splice(NewRetMBB->end(), CurMBB, SplitIt, CurMBB->end());
-}
-
-std::pair<MachineBasicBlock *, MachineInstr *>
-X86WinFixupBufferSecurityCheckPass::getSecurityCheckerBasicBlock(
- MachineFunction &MF) {
- MachineBasicBlock::reverse_iterator RBegin, REnd;
-
- for (auto &MBB : llvm::reverse(MF)) {
- for (RBegin = MBB.rbegin(), REnd = MBB.rend(); RBegin != REnd; ++RBegin) {
- auto &MI = *RBegin;
- if (MI.getOpcode() == X86::CALL64pcrel32 &&
- MI.getNumExplicitOperands() == 1) {
- auto MO = MI.getOperand(0);
- if (MO.isGlobal()) {
- auto Callee = dyn_cast<Function>(MO.getGlobal());
- if (Callee && Callee->getName() == "__security_check_cookie") {
- return std::make_pair(&MBB, &MI);
- break;
- }
- }
- }
- }
- }
- return std::make_pair(nullptr, nullptr);
-}
-
-void X86WinFixupBufferSecurityCheckPass::getGuardCheckSequence(
- MachineBasicBlock *CurMBB, MachineInstr *CheckCall,
- MachineInstr *SeqMI[5]) {
-
- MachineBasicBlock::iterator UIt(CheckCall);
- MachineBasicBlock::reverse_iterator DIt(CheckCall);
- // Seq From StackUp to Stack Down Is fixed.
- // ADJCALLSTACKUP64
- ++UIt;
- SeqMI[4] = &*UIt;
-
- // CALL __security_check_cookie
- SeqMI[3] = CheckCall;
-
- // COPY function slot cookie
- ++DIt;
- SeqMI[2] = &*DIt;
-
- // ADJCALLSTACKDOWN64
- ++DIt;
- SeqMI[1] = &*DIt;
-
- MachineBasicBlock::reverse_iterator XIt(SeqMI[1]);
- for (; XIt != CurMBB->rbegin(); ++XIt) {
- auto &CI = *XIt;
- if ((CI.getOpcode() == X86::XOR64_FP) || (CI.getOpcode() == X86::XOR32_FP))
- break;
- }
- SeqMI[0] = &*XIt;
-}
-
-std::pair<MachineInstr *, MachineInstr *>
-X86WinFixupBufferSecurityCheckPass::CreateFailCheckSequence(
- MachineBasicBlock *CurMBB, MachineBasicBlock *FailMBB,
- MachineInstr *SeqMI[5]) {
-
- auto MF = CurMBB->getParent();
-
- Module &M = *MF->getFunction().getParent();
- GlobalVariable *GV = M.getGlobalVariable("__security_cookie");
- assert(GV && " Security Cookie was not installed!");
-
- const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
-
- MachineInstr *GuardXor = SeqMI[0];
- MachineBasicBlock::iterator InsertPt(GuardXor);
- ++InsertPt;
-
- // Compare security_Cookie with XOR_Val, if not same, we have violation
- auto CMI = BuildMI(*CurMBB, InsertPt, DebugLoc(), TII->get(X86::CMP64rm))
- .addReg(GuardXor->getOperand(0).getReg())
- .addReg(X86::RIP)
- .addImm(1)
- .addReg(X86::NoRegister)
- .addGlobalAddress(GV)
- .addReg(X86::NoRegister);
-
- BuildMI(*CurMBB, InsertPt, DebugLoc(), TII->get(X86::JCC_1))
- .addMBB(FailMBB)
- .addImm(X86::COND_NE);
-
- auto JMI = BuildMI(*CurMBB, InsertPt, DebugLoc(), TII->get(X86::JMP_1));
-
- return std::make_pair(CMI.getInstr(), JMI.getInstr());
-}
-
-void X86WinFixupBufferSecurityCheckPass::FinishBlock(MachineBasicBlock *MBB) {
- LivePhysRegs LiveRegs;
- computeAndAddLiveIns(LiveRegs, *MBB);
-}
-
-void X86WinFixupBufferSecurityCheckPass::FinishFunction(
- MachineBasicBlock *FailMBB, MachineBasicBlock *NewRetMBB) {
- FailMBB->getParent()->RenumberBlocks();
- // FailMBB includes call to MSCV RT where is __security_check_cookie
- // function is called. This function uses regcall and it expects cookie
- // value from stack slot.( even if this is modified)
- // Before going further we compute back livein for this block to make sure
- // it is live and provided.
- FinishBlock(FailMBB);
- FinishBlock(NewRetMBB);
-}
-
-bool X86WinFixupBufferSecurityCheckPass::runOnMachineFunction(
- MachineFunction &MF) {
- bool Changed = false;
- const X86Subtarget &STI = MF.getSubtarget<X86Subtarget>();
-
- if (!(STI.isTargetWindowsItanium() || STI.isTargetWindowsMSVC()))
- return Changed;
-
- // Check if security cookie was installed or not
- Module &M = *MF.getFunction().getParent();
- GlobalVariable *GV = M.getGlobalVariable("__security_cookie");
- if (!GV)
- return Changed;
-
- const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-
- // Check if security check cookie was installed or not
- auto [CurMBB, CheckCall] = getSecurityCheckerBasicBlock(MF);
-
- if (!CheckCall)
- return Changed;
-
- MachineBasicBlock *FailMBB = MF.CreateMachineBasicBlock();
- MachineBasicBlock *NewRetMBB = MF.CreateMachineBasicBlock();
-
- MF.insert(MF.end(), NewRetMBB);
- MF.insert(MF.end(), FailMBB);
-
- MachineInstr *SeqMI[5];
- getGuardCheckSequence(CurMBB, CheckCall, SeqMI);
- // MachineInstr * GuardXor = SeqMI[0];
-
- auto FailSeqRange = CreateFailCheckSequence(CurMBB, FailMBB, SeqMI);
- MachineInstrBuilder JMI(MF, FailSeqRange.second);
-
- // After Inserting JMP_1, we can not have two terminators
- // in same block, split CurrentMBB after JMP_1
- MachineBasicBlock::iterator SplitIt(SeqMI[4]);
- ++SplitIt;
- SplitBasicBlock(CurMBB, NewRetMBB, SplitIt);
-
- // Fill up Failure Routine, move Fail Check Squence from CurMBB to FailMBB
- MachineBasicBlock::iterator U1It(SeqMI[1]);
- MachineBasicBlock::iterator U2It(SeqMI[4]);
- ++U2It;
- FailMBB->splice(FailMBB->end(), CurMBB, U1It, U2It);
- BuildMI(*FailMBB, FailMBB->end(), DebugLoc(), TII->get(X86::INT3));
-
- // Move left over instruction after StackUp
- // from Current Basic BLocks into New Return Block
- JMI.addMBB(NewRetMBB);
- MachineBasicBlock::iterator SplicePt(JMI.getInstr());
- ++SplicePt;
- if (SplicePt != CurMBB->end())
- NewRetMBB->splice(NewRetMBB->end(), CurMBB, SplicePt);
-
- // Restructure Basic Blocks
- CurMBB->addSuccessor(NewRetMBB);
- CurMBB->addSuccessor(FailMBB);
-
- FinishFunction(FailMBB, NewRetMBB);
- return !Changed;
-}
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll
index 6aefc5341da07..83b964a89d44d 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-stack-protector-windows.ll
@@ -17,8 +17,12 @@ define void @caller() sspreq {
; CHECK-NEXT: ldr x8, [x8, :lo12:__security_cookie]
; CHECK-NEXT: str x8, [sp, #8]
; CHECK-NEXT: bl callee
-; CHECK-NEXT: ldr x0, [sp, #8]
-; CHECK-NEXT: bl __security_check_cookie
+; CHECK-NEXT: adrp x8, __security_cookie
+; CHECK-NEXT: ldr x9, [sp, #8]
+; CHECK-NEXT: ldr x8, [x8, :lo12:__security_cookie]
+; CHECK-NEXT: cmp x8, x9
+; CHECK-NEXT: b.ne .LBB0_2
+; CHECK-NEXT: // %bb.1: // %entry
; CHECK-NEXT: .seh_startepilogue
; CHECK-NEXT: ldr x30, [sp, #16] // 8-byte Folded Reload
; CHECK-NEXT: .seh_save_reg x30, 16
@@ -26,6 +30,10 @@ define void @caller() sspreq {
; CHECK-NEXT: .seh_stackalloc 32
; CHECK-NEXT: .seh_endepilogue
; CHECK-NEXT: ret
+; CHECK-NEXT: .LBB0_2: // %entry
+; CHECK-NEXT: ldr x0, [sp, #8]
+; CHECK-NEXT: bl __security_check_cookie
+; CHECK-NEXT: brk #0x1
; CHECK-NEXT: .seh_endfunclet
; CHECK-NEXT: .seh_endproc
entry:
diff --git a/llvm/test/CodeGen/X86/opt-pipeline.ll b/llvm/test/CodeGen/X86/opt-pipeline.ll
index d72f517cfb603..5dd368a9e1439 100644
--- a/llvm/test/CodeGen/X86/opt-pipeline.ll
+++ b/llvm/test/CodeGen/X86/opt-pipeline.ll
@@ -121,7 +121,6 @@
; CHECK-NEXT: Peephole Optimizations
; CHECK-NEXT: Remove dead machine instructions
; CHECK-NEXT: Live Range Shrink
-; CHECK-NEXT: X86 Windows Fixup Buffer Security Check
; CHECK-NEXT: X86 Fixup SetCC
; CHECK-NEXT: Lazy Machine Block Frequency Analysis
; CHECK-NEXT: X86 LEA Optimize
diff --git a/llvm/test/CodeGen/X86/stack-protector-msvc.ll b/llvm/test/CodeGen/X86/stack-protector-msvc.ll
index d718062d2c485..142c7d7b57b4b 100644
--- a/llvm/test/CodeGen/X86/st...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
if (const Function *GuardCheckFn = TLI.getSSPStackGuardCheck(M)) {

  // First create the loads to the guard/stack slot for the comparison.
  EVT PtrTy = TLI.getPointerTy(DAG.getDataLayout());
You should essentially never use getPointerTy. The correct pointer type should be implied by some value in context
The stack protector is an invented value which doesn't have any inherent width from the IR or datalayout. The width is part of the ABI. In practice, on CPU targets it's "pointer width".
We could add a method to TargetLowering, I guess. Something like `virtual MVT getStackProtectorTy(const DataLayout &DL) { return getPointerTy(DL, 0); }`.
Looks like it's really just getFrameIndexTy?
Fixed in latest update.
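For reference, a sketch of what the getFrameIndexTy-based fix could look like inside visitSPDescriptorFailure, under the assumption that both the pointer and the loaded value use the frame-index type (illustrative, not the verbatim final code):

```cpp
#include "llvm/CodeGen/MachineMemOperand.h"
#include "llvm/CodeGen/SelectionDAG.h"
#include "llvm/CodeGen/TargetLowering.h"
using namespace llvm;

// Load the stack guard slot, deriving the type from the frame index via
// TargetLowering::getFrameIndexTy instead of a generic getPointerTy query.
static SDValue loadGuardSlot(SelectionDAG &DAG, const TargetLowering &TLI,
                             int FI, const SDLoc &dl) {
  MVT FrameIdxTy = TLI.getFrameIndexTy(DAG.getDataLayout());
  SDValue StackSlotPtr = DAG.getFrameIndex(FI, FrameIdxTy);
  return DAG.getLoad(
      FrameIdxTy, dl, DAG.getEntryNode(), StackSlotPtr,
      MachinePointerInfo::getFixedStack(DAG.getMachineFunction(), FI),
      MaybeAlign(), MachineMemOperand::MOVolatile);
}
```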
Align Align = DAG.getDataLayout().getPrefTypeAlign(
    PointerType::get(M.getContext(), 0));
Should use the correct address space pointer, implied from the value
Fixed in latest update.
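A plausible shape for that fix, assuming the alignment should come from a pointer in the alloca address space rather than hard-coded address space 0 (an assumption on my part, not the verbatim final code):

```cpp
// Use the alloca address space from the data layout, since the guard slot
// is a stack object, instead of hard-coding address space 0.
unsigned AllocaAS = DAG.getDataLayout().getAllocaAddrSpace();
Align SlotAlign = DAG.getDataLayout().getPrefTypeAlign(
    PointerType::get(M.getContext(), AllocaAS));
```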
This all makes sense to me; my only high-level question is if we care about preserving the old codegen at -Oz.
Thanks! I think we can skip this behavior at -Oz; to do that, I am making a change to skip this optimization when MinSize is enabled.
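A minimal sketch of such a gate, assuming it keys off the function's minsize attribute, which -Oz sets (illustrative; the patch itself wires this decision into the stack protector lowering):

```cpp
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/IR/Function.h"
using namespace llvm;

// Prefer the inline cookie compare unless the function is optimized for
// minimum size; at -Oz the plain __security_check_cookie call is smaller.
static bool preferInlineGuardCheck(const MachineFunction &MF) {
  return !MF.getFunction().hasMinSize();
}
```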
Is there anything I can do to help with this?
Sorry for the delay, I was OOO. Will post an update during the week. Thanks.
Fixed in latest update.
LGTM
…#143500) The assertions in llvm/test/CodeGen/X86/tailcc-ssp.ll were outdated. The initial comment indicated they were generated with `utils/update_llc_test_checks.py UTC_ARGS: --version 5`, but this was not accurate based on the file's content. Running `utils/update_llc_test_checks.py` regenerated the assertions, aligning them with the current `llc` output. This commit ensures that the test's claimed behavior accurately reflects the actual `llc` output, even though the tests were already passing. This was identified by @efriedma-quic during review of #136290. Submitting a separate PR to make sure these changes stay isolated.
…llvm#95904)" This reverts commit 854bbc5.
Co-authored-by: Eli Friedman <[email protected]>
Co-authored-by: Eli Friedman <[email protected]>
This patch optimizes the Windows security cookie check mechanism by moving the comparison inline and only calling __security_check_cookie when the check fails. This reduces the overhead of making a DLL call for every function return. Previously, we implemented this optimization through a machine pass (X86WinFixupBufferSecurityCheckPass) in PR llvm#95904 submitted by @mahesh-attarde. We have reverted that pass in favor of this new approach. We have also abandoned the AArch64-specific implementation of the same pass in PR llvm#121938 in favor of this more general solution.

The old machine instruction pass approach:
- Scanned the generated code to find __security_check_cookie calls
- Modified these calls by splitting basic blocks
- Added comparison logic and conditional branching
- Required complex block management and live register computation

The new approach:
- Implements the same optimization during instruction selection
- Directly emits the comparison and conditional branching
- No need for post-processing or basic block manipulation
- Disables optimization at -Oz

Thanks @tamaspetz, @efriedma-quic and @arsenm for their help.