Commit c27223f

Get ykllvm to provide enough info to identify a zero-length call.
PT has a clause under which returns are not compressed if the call is both direct and targets the address immediately after the call. Section 33.4.2.2:

> For near CALLs, push the Next IP onto the stack... Note that this
> excludes zero-length CALLs, which are direct near CALLs with
> displacement zero (to the next IP). These CALLs typically don’t have
> matching RETs.

For example, this kind of thing is never compressed (a direct near call is 5 bytes long, so the next IP is 0x1239):

```
0x1234: call 0x1239
0x1239: pop rax
```

On x86_64 the instruction pointer register can't be read directly like a general-purpose register, so people sometimes use this trick to get its value.

This change makes the compiler emit enough call information (namely, the return address of each call) for the runtime to decide whether a call was "zero-length".

It's not clear to me whether this has ever bitten us, but it could be one of the causes of the rare PT decoding crashes that occasionally crop up.
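As a rough illustration of the check this extra information enables, the sketch below compares a call's target with its return address. The `CallInfo` struct, its field names, and `isZeroLengthCall` are hypothetical and are not part of this commit; the encoding of an indirect call as a target of 0 is an assumption borrowed from the "or 0" comment in the diff.

```cpp
#include <cstdint>

// Hypothetical per-call record; the field names are illustrative only.
struct CallInfo {
  uint64_t CallAddr;   // Address of the call instruction itself.
  uint64_t RetAddr;    // Address of the instruction following the call.
  uint64_t TargetAddr; // Target of a direct call, or 0 if the call is indirect.
};

// A call is "zero-length" when it is direct and its target is the next IP,
// i.e. the target equals the call's own return address.
inline bool isZeroLengthCall(const CallInfo &CI) {
  return CI.TargetAddr != 0 && CI.TargetAddr == CI.RetAddr;
}
```

A trace decoder could consult a predicate like this before assuming that a call has a matching, compressed return.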
1 parent e69ab01 commit c27223f

2 files changed: 34 additions and 6 deletions

llvm/include/llvm/CodeGen/AsmPrinter.h

Lines changed: 8 additions & 1 deletion
@@ -249,8 +249,15 @@ class AsmPrinter : public MachineFunctionPass {
   MCSection *YkLastBBAddrMapSection = nullptr;
 
   /// Symbols marking the call instructions of each block. Used for the Yk JIT.
+  ///
+  /// Values are a 3-tuple:
+  ///  - A symbol marking the call instruction.
+  ///  - A symbol marking the return address of the call (if it were to return
+  ///    by conventional means)
+  ///  - If it's a direct call, a symbol marking the target of the call, or
+  ///    `nullptr` if the call is indirect.
   std::map<const MachineBasicBlock *,
-           SmallVector<std::tuple<MCSymbol *, MCSymbol *>>>
+           SmallVector<std::tuple<MCSymbol *, MCSymbol *, MCSymbol *>>>
       YkCallMarkerSyms;
 
 protected:

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

Lines changed: 26 additions & 5 deletions
@@ -1528,8 +1528,10 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
         for (auto Tup : YkCallMarkerSyms[&MBB]) {
           // Emit address of the call instruction.
           OutStreamer->emitSymbolValue(std::get<0>(Tup), getPointerSize());
+          // Emit the return address of the call.
+          OutStreamer->emitSymbolValue(std::get<1>(Tup), getPointerSize());
           // Emit address of target if known, or 0.
-          MCSymbol *Target = std::get<1>(Tup);
+          MCSymbol *Target = std::get<2>(Tup);
           if (Target)
             OutStreamer->emitSymbolValue(Target, getPointerSize());
           else
@@ -1988,15 +1990,33 @@ void AsmPrinter::emitFunctionBody() {
             (MI.getOpcode() != TargetOpcode::STACKMAP) &&
             (MI.getOpcode() != TargetOpcode::PATCHPOINT) &&
             (MI.getOpcode() != TargetOpcode::STATEPOINT)) {
+          // Record the address of the call instruction itself.
           MCSymbol *YkPreCallSym =
               MF->getContext().createTempSymbol("yk_precall", true);
           OutStreamer->emitLabel(YkPreCallSym);
+
+          // Codegen it as usual.
+          emitInstruction(&MI);
+
+          // Record the address of the instruction following the call. In other
+          // words, this is the return address of the call.
+          MCSymbol *YkPostCallSym =
+              MF->getContext().createTempSymbol("yk_postcall", true);
+          OutStreamer->emitLabel(YkPostCallSym);
+
+          // Figure out if this is a direct or indirect call.
+          //
+          // If it's direct, then we know the call's target from the first
+          // operand alone.
           const MachineOperand CallOpnd = MI.getOperand(0);
           MCSymbol *CallTargetSym = nullptr;
           if (CallOpnd.isGlobal()) {
-            // Statically known function address.
+            // Direct call.
             CallTargetSym = getSymbol(CallOpnd.getGlobal());
-          }
+          } else if (CallOpnd.isMCSymbol()) {
+            // Also a direct call.
+            CallTargetSym = CallOpnd.getMCSymbol();
+          } // Otherwise it's an indirect call.
 
           // Ensure we are only working with near calls. This matters because
           // Intel PT optimises near calls, and it simplifies our implementation
@@ -2005,10 +2025,11 @@ void AsmPrinter::emitFunctionBody() {
           assert(!MF->getSubtarget().getInstrInfo()->isFarCall(MI));
 
           assert(YkCallMarkerSyms.find(&MBB) != YkCallMarkerSyms.end());
-          YkCallMarkerSyms[&MBB].push_back({YkPreCallSym, CallTargetSym});
+          YkCallMarkerSyms[&MBB].push_back({YkPreCallSym, YkPostCallSym, CallTargetSym});
+        } else {
+          emitInstruction(&MI);
         }
 
-        emitInstruction(&MI);
         // Generate labels for function calls so we can record the correct
         // instruction offset. The conditions for generating the label must be
         // the same as the ones for generating the stackmap call in
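
With this change, each call recorded in `emitBBAddrMapSection` is emitted as three pointer-sized values: the call address, the return address, and the target (or 0 when the target is not known). A minimal sketch of how a consumer might read those triples back follows. It assumes 8-byte pointers in host byte order and that the number of calls is already known from elsewhere in the section; the rest of the section layout is not visible in this diff, and `CallRecord` and `readCallRecords` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical in-memory form of one emitted call entry.
struct CallRecord {
  uint64_t CallAddr;   // Address of the call instruction.
  uint64_t RetAddr;    // Return address (instruction after the call).
  uint64_t TargetAddr; // Direct call target, or 0 if unknown/indirect.
};

// Read up to NumCalls records from Data. Stops early if the buffer is too
// short to hold another complete three-value record.
inline std::vector<CallRecord> readCallRecords(const uint8_t *Data, size_t Len,
                                               size_t NumCalls) {
  std::vector<CallRecord> Out;
  size_t Off = 0;
  for (size_t I = 0; I < NumCalls; ++I) {
    uint64_t Vals[3];
    if (Len - Off < sizeof(Vals))
      break;
    // memcpy avoids unaligned reads from the raw section bytes.
    std::memcpy(Vals, Data + Off, sizeof(Vals));
    Off += sizeof(Vals);
    Out.push_back({Vals[0], Vals[1], Vals[2]});
  }
  return Out;
}
```

Any record whose `TargetAddr` is non-zero and equal to its `RetAddr` would then describe a zero-length call.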
