Commit 7228670

[MC][NFC] Allow MCInstrAnalysis to store state
Currently, all the analysis functions provided by `MCInstrAnalysis` work on a single instruction. On some targets, this limits the kind of instructions that can be successfully analyzed, as common constructs may need multiple instructions.

For example, a typical call sequence on RISC-V uses an auipc+jalr pair. In order to analyse the jalr inside `evaluateBranch`, information about the corresponding auipc is needed. Similarly, AArch64 uses adrp+ldr pairs to access globals.

This patch proposes to add state to `MCInstrAnalysis` to support these use cases. Two new virtual methods are added:
- `updateState`: takes an instruction and its address. This method should be called by clients on every instruction and allows targets to store whatever information they need to analyse future instructions.
- `resetState`: clears the state whenever it becomes irrelevant. Clients could call this, for example, when starting to disassemble a new function.

Note that the default implementations do nothing, so this patch is NFC. No actual state is stored inside `MCInstrAnalysis`; deciding the structure of the state is left to the targets.

This patch also modifies llvm-objdump to use the new interface.

This patch is an alternative to D116677, and the idea of storing state in `MCInstrAnalysis` was first discussed there.
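To make the auipc+jalr case concrete, here is a minimal sketch of how a target could use the two hooks. It is not part of this patch and does not use the real LLVM classes: `ToyInst`, `ToyOpcode`, `ToyInstrAnalysis` and `ToyRISCVAnalysis` are hypothetical stand-ins, and the choice to drop the state on any non-auipc instruction is a simplifying assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-ins for MCInst and MCInstrAnalysis.
enum class ToyOpcode { Auipc, Jalr, Other };

struct ToyInst {
  ToyOpcode Opcode;
  unsigned Rd;  // destination register
  unsigned Rs1; // base register (used by Jalr)
  int64_t Imm;  // immediate operand
};

class ToyInstrAnalysis {
public:
  virtual ~ToyInstrAnalysis() = default;
  // Default implementations do nothing, mirroring the NFC defaults.
  virtual void resetState() {}
  virtual void updateState(const ToyInst &, uint64_t) {}
  virtual bool evaluateBranch(const ToyInst &, uint64_t, uint64_t &) const {
    return false;
  }
};

// Hypothetical target analysis that remembers the most recent auipc.
class ToyRISCVAnalysis : public ToyInstrAnalysis {
  struct PendingAuipc {
    unsigned Reg;   // register written by the auipc
    uint64_t Value; // value it holds: Addr + (Imm << 12)
  };
  std::optional<PendingAuipc> Pending;

public:
  void resetState() override { Pending.reset(); }

  void updateState(const ToyInst &Inst, uint64_t Addr) override {
    if (Inst.Opcode == ToyOpcode::Auipc)
      Pending = PendingAuipc{Inst.Rd,
                             Addr + (static_cast<uint64_t>(Inst.Imm) << 12)};
    else
      Pending.reset(); // Assumption: any other instruction ends the pair.
  }

  bool evaluateBranch(const ToyInst &Inst, uint64_t /*Addr*/,
                      uint64_t &Target) const override {
    // A jalr can only be resolved if the preceding auipc wrote its base.
    if (Inst.Opcode != ToyOpcode::Jalr || !Pending ||
        Pending->Reg != Inst.Rs1)
      return false;
    Target = Pending->Value + static_cast<uint64_t>(Inst.Imm);
    return true;
  }
};
```

With this sketch, a jalr is only resolved when `updateState` has seen the matching auipc immediately before it; after `resetState`, the same jalr is reported as unknown again.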
1 parent fdb29f7 commit 7228670

2 files changed: +31 -5 lines changed
llvm/include/llvm/MC/MCInstrAnalysis.h

Lines changed: 15 additions & 0 deletions
@@ -37,6 +37,21 @@ class MCInstrAnalysis {
   MCInstrAnalysis(const MCInstrInfo *Info) : Info(Info) {}
   virtual ~MCInstrAnalysis() = default;
 
+  /// Clear the internal state. See updateState for more information.
+  virtual void resetState() {}
+
+  /// Update internal state with \p Inst at \p Addr.
+  ///
+  /// For some types of analyses, inspecting a single instruction is not
+  /// sufficient. Some examples are auipc/jalr pairs on RISC-V or adrp/ldr pairs
+  /// on AArch64. To support inspecting multiple instructions, targets may keep
+  /// track of an internal state while analysing instructions. Clients should
+  /// call updateState for every instruction which allows later calls to one of
+  /// the analysis functions to take previous instructions into account.
+  /// Whenever state becomes irrelevant (e.g., when starting to disassemble a
+  /// new function), clients should call resetState to clear it.
+  virtual void updateState(const MCInst &Inst, uint64_t Addr) {}
+
   virtual bool isBranch(const MCInst &Inst) const {
     return Info->get(Inst.getOpcode()).isBranch();
   }
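The client-side protocol described in the doc comment (reset at a function boundary, then query and update for every instruction) can be sketched as follows. `SketchInst`, `RecordingAnalysis` and `walkFunction` are hypothetical names used only for illustration; the null checks around every call are an assumption of this sketch rather than something the interface requires.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction; only the encoded size matters.
struct SketchInst {
  uint64_t Size;
};

// Hypothetical stand-in for MCInstrAnalysis that records the call order.
class RecordingAnalysis {
public:
  std::vector<std::string> Log;

  void resetState() { Log.push_back("reset"); }
  void updateState(const SketchInst &, uint64_t Addr) {
    Log.push_back("update@" + std::to_string(Addr));
  }
  bool evaluateBranch(const SketchInst &, uint64_t Addr) {
    Log.push_back("query@" + std::to_string(Addr));
    return false;
  }
};

// Walk the instructions of one function the way a client could.
inline void walkFunction(RecordingAnalysis *Analysis,
                         const std::vector<SketchInst> &Insts,
                         uint64_t Start) {
  // State from a previous function is irrelevant here, so clear it first.
  if (Analysis)
    Analysis->resetState();

  uint64_t Addr = Start;
  for (const SketchInst &Inst : Insts) {
    if (Analysis) {
      // Query first: the analysis sees only the state left by earlier
      // instructions. Then record the current instruction for later queries.
      Analysis->evaluateBranch(Inst, Addr);
      Analysis->updateState(Inst, Addr);
    }
    Addr += Inst.Size;
  }
}
```

Calling `updateState` after the query for the same instruction matches the order used in the llvm-objdump changes in this commit, where the instruction is analysed before it is added to the state.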

llvm/tools/llvm-objdump/llvm-objdump.cpp

Lines changed: 16 additions & 5 deletions
@@ -842,7 +842,7 @@ class DisassemblerTarget {
   std::unique_ptr<const MCSubtargetInfo> SubtargetInfo;
   std::shared_ptr<MCContext> Context;
   std::unique_ptr<MCDisassembler> DisAsm;
-  std::shared_ptr<const MCInstrAnalysis> InstrAnalysis;
+  std::shared_ptr<MCInstrAnalysis> InstrAnalysis;
   std::shared_ptr<MCInstPrinter> InstPrinter;
   PrettyPrinter *Printer;
 
@@ -1265,14 +1265,19 @@ collectBBAddrMapLabels(const std::unordered_map<uint64_t, BBAddrMap> &AddrToBBAd
   }
 }
 
-static void collectLocalBranchTargets(
-    ArrayRef<uint8_t> Bytes, const MCInstrAnalysis *MIA, MCDisassembler *DisAsm,
-    MCInstPrinter *IP, const MCSubtargetInfo *STI, uint64_t SectionAddr,
-    uint64_t Start, uint64_t End, std::unordered_map<uint64_t, std::string> &Labels) {
+static void
+collectLocalBranchTargets(ArrayRef<uint8_t> Bytes, MCInstrAnalysis *MIA,
+                          MCDisassembler *DisAsm, MCInstPrinter *IP,
+                          const MCSubtargetInfo *STI, uint64_t SectionAddr,
+                          uint64_t Start, uint64_t End,
+                          std::unordered_map<uint64_t, std::string> &Labels) {
   // So far only supports PowerPC and X86.
   if (!STI->getTargetTriple().isPPC() && !STI->getTargetTriple().isX86())
     return;
 
+  if (MIA)
+    MIA->resetState();
+
   Labels.clear();
   unsigned LabelCount = 0;
   Start += SectionAddr;
@@ -1298,6 +1303,7 @@ static void collectLocalBranchTargets(
           !Labels.count(Target) &&
           !(STI->getTargetTriple().isPPC() && Target == Index))
         Labels[Target] = ("L" + Twine(LabelCount++)).str();
+      MIA->updateState(Inst, Index);
     }
     Index += Size;
   }
@@ -1939,6 +1945,9 @@ disassembleObject(ObjectFile &Obj, const ObjectFile &DbgObj,
                                   BBAddrMapLabels);
       }
 
+      if (DT->InstrAnalysis)
+        DT->InstrAnalysis->resetState();
+
       while (Index < End) {
         // ARM and AArch64 ELF binaries can interleave data and text in the
         // same section. We rely on the markers introduced to understand what
@@ -2155,6 +2164,8 @@ disassembleObject(ObjectFile &Obj, const ObjectFile &DbgObj,
           if (TargetOS == &CommentStream)
             *TargetOS << "\n";
         }
+
+        DT->InstrAnalysis->updateState(Inst, SectionAddr + Index);
       }
     }
 