Skip to content

[llvm-nm] Introduce synthetic flag #138232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions llvm/docs/CommandGuide/llvm-nm.rst
Original file line number Diff line number Diff line change
Expand Up @@ -257,6 +257,11 @@ OPTIONS

Do not filter special symbols from the output.

.. option:: --synthetic
Include synthetic symbols in the output. Synthetic symbols are not present in
the original object file but are generated by the tool to represent relevant
runtime or linking information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps

Include synthetic symbols in the output. Synthetic symbols are not present in the original object file. This currently includes symbols derived from PLT entries for ELF.

I find
"are generated by the tool to represent relevant runtime or linking information." very vague and not very useful. The PLT detail seems important.


.. option:: --undefined-only, -u

Print only undefined symbols.
Expand Down
90 changes: 90 additions & 0 deletions llvm/test/tools/llvm-nm/AArch64/synthetic.test
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
## Test --synthetic flag.

# RUN: yaml2obj %s -o %t

# RUN: llvm-nm %t | count 0
# RUN: llvm-nm %t --synthetic | FileCheck %s

--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_DYN
Machine: EM_AARCH64
Sections:
- Name: .rela.plt
Type: SHT_RELA
Flags: [ SHF_ALLOC, SHF_INFO_LINK ]
Address: 0x540
Link: .dynsym
AddressAlign: 0x8
Info: .got
Relocations:
- Offset: 0x10FA8
Symbol: __libc_start_main
Type: R_AARCH64_JUMP_SLOT
- Offset: 0x10FB0
Symbol: __cxa_finalize
Type: R_AARCH64_JUMP_SLOT
- Offset: 0x10FB8
Symbol: __gmon_start__
Type: R_AARCH64_JUMP_SLOT
- Offset: 0x10FC0
Symbol: abort
Type: R_AARCH64_JUMP_SLOT
- Offset: 0x10FC8
Symbol: puts
Type: R_AARCH64_JUMP_SLOT
- Name: .plt
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
Address: 0x5D0
AddressAlign: 0x10
Content: F07BBFA99000009011D247F910823E9120021FD61F2003D51F2003D51F2003D59000009011D647F910A23E9120021FD69000009011DA47F910C23E9120021FD69000009011DE47F910E23E9120021FD69000009011E247F910023F9120021FD69000009011E647F910223F9120021FD6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment illustrating why this and the .got have the contents they do could be useful for helping future test readers/maintainers.

- Name: .text
Type: SHT_PROGBITS
Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
Address: 0x640
- Name: .dynamic
Type: SHT_DYNAMIC
Flags: [ SHF_WRITE, SHF_ALLOC ]
Address: 0x10DA0
Link: .dynstr
AddressAlign: 0x8
Entries:
- Tag: DT_JMPREL
Value: 0x540
- Tag: DT_PLTRELSZ
Value: 0x78
- Tag: DT_PLTGOT
Value: 0x10F90
- Tag: DT_NULL
Value: 0x0
- Name: .got
Type: SHT_PROGBITS
Flags: [ SHF_WRITE, SHF_ALLOC ]
Address: 0x10F90
AddressAlign: 0x8
EntSize: 0x8
Content: 000000000000000000000000000000000000000000000000D005000000000000D005000000000000D005000000000000D005000000000000D005000000000000A00D01000000000000000000000000000000000000000000000000000000000054070000000000000000000000000000
DynamicSymbols:
- Name: __libc_start_main
Type: STT_FUNC
Binding: STB_GLOBAL
- Name: __cxa_finalize
Type: STT_FUNC
Binding: STB_WEAK
- Name: __gmon_start__
Binding: STB_WEAK
- Name: abort
Type: STT_FUNC
Binding: STB_GLOBAL
- Name: puts
Type: STT_FUNC
Binding: STB_GLOBAL

# CHECK: 0000000000000600 T __cxa_finalize@plt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing two or three @plt should be sufficient.
You can define a function in b.so and reference it from a.c using clang a.c b.so -nostdlib to eliminate gmon_start/libc_start_main references

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, if it is a major thing I would do it, just didn't want to waste time on this since 5 seems to be not a big issue :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a major thing. If you find patches to llvm-readobj/llvm-objdump, you will find @jh7370 and my requests to make tests minimal/sufficient to make test updating easier.

You could employ llvm/utils/update_test_body.py to make generating the test easier.

# CHECK-NEXT: 0000000000000610 T __gmon_start__@plt
# CHECK-NEXT: 00000000000005f0 T __libc_start_main@plt
# CHECK-NEXT: 0000000000000620 T abort@plt
# CHECK-NEXT: 0000000000000630 T puts@plt
1 change: 1 addition & 0 deletions llvm/tools/llvm-nm/Opts.td
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ defm radix : Eq<"radix", "Radix (o/d/x) for printing symbol Values">, MetaVarNam
def reverse_sort : FF<"reverse-sort", "Sort in reverse order">;
def size_sort : FF<"size-sort", "Sort symbols by size">;
def special_syms : FF<"special-syms", "Do not filter special symbols from the output">;
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;

Since this is rendered as monospace font on the command-line, a double space here will look ugly. Also, please reflow this option so it spreads over multiple lines, since it is rather long.

def undefined_only : FF<"undefined-only", "Show only undefined symbols">;
def version : FF<"version", "Display the version">;
def without_aliases : FF<"without-aliases", "Exclude aliases from output">, Flags<[HelpHidden]>;
Expand Down
58 changes: 58 additions & 0 deletions llvm/tools/llvm-nm/llvm-nm.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
#include "llvm/Demangle/Demangle.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/MC/TargetRegistry.h"
#include "llvm/Object/Archive.h"
#include "llvm/Object/COFF.h"
#include "llvm/Object/COFFImportFile.h"
Expand Down Expand Up @@ -110,6 +111,7 @@ static bool PrintSize;
static bool Quiet;
static bool ReverseSort;
static bool SpecialSyms;
static bool SyntheticSyms;
static bool SizeSort;
static bool UndefinedOnly;
static bool WithoutAliases;
Expand Down Expand Up @@ -1783,6 +1785,55 @@ getDynamicSyms(SymbolicFile &Obj) {
return E->getDynamicSymbolIterators();
}

// Returns false if there is error found or true otherwise.
static bool getPltSyms(SymbolicFile &Obj, std::vector<NMSymbol> &SymbolList) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better for this function to return Error and for the caller to report the error. Perhaps even try Expected<std::vector<NMSymbol>> rather than using a parameter to record the results in, but that may not be a good idea, for performance reasons, depending on the size of the vector you'd often expect.

const auto *ELFObj = dyn_cast<ELFObjectFileBase>(&Obj);
if (!ELFObj)
return true;

std::string Err;
Triple TT;
TT.setArch(ELFObj->getArch());
TT.setOS(ELFObj->getOS());
const Target *TheTarget = TargetRegistry::lookupTarget(TT, Err);
if (!TheTarget) {
error("unable to find target for " + Obj.getFileName() + ": " + Err);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give an example of what the full error message looks like, if e.g. you use llvm-nm built without X86_64 support with one that does support that target, please? I'd like to be sure the message isn't too clunky.

return false;
}

std::unique_ptr<const MCSubtargetInfo> STI;
STI.reset(TheTarget->createMCSubtargetInfo(
TT.getTriple(), ELFObj->tryGetCPUName().value_or("").str(), ""));
if (!STI) {
error("unable to create subtarget info for " + Obj.getFileName());
return false;
}

for (auto Plt : ELFObj->getPltEntries(*STI)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style guide says to only use auto where it improves readability. As things stand, I can't tell what the type of Plt is here, which means it's harming readability here and should be expanded. Also, using auto here doesn't convey whether this is a reference or not, when it probably should be const &.

if (Plt.Symbol) {
SymbolRef Symbol(*Plt.Symbol, ELFObj);
if (Expected<StringRef> NameOrErr = Symbol.getName()) {
if (!NameOrErr->empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a little unfortunate that we're just ignoring such symbols. Perhaps it should be a warning too, or you could actually use it as an empty string (so you'd get @plt simply as the name). The latter feels most appropriate to me without thinking too hard about it at the moment.

NMSymbol S = {};
S.Address = Plt.Address;
S.Name = NameOrErr->str() + "@plt";
S.TypeChar = 'T';
S.SectionName = Plt.Section;
SymbolList.push_back(S);
}
} else {
consumeError(NameOrErr.takeError());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this isn't a warning?

}
} else {
WithColor::warning(errs(), ToolName)
<< "PLT entry at 0x" + Twine::utohexstr(Plt.Address)
<< " references an invalid symbol";
}
}

return true;
}

// Returns false if there is error found or true otherwise.
static bool getSymbolNamesFromObject(SymbolicFile &Obj,
std::vector<NMSymbol> &SymbolList) {
Expand All @@ -1807,6 +1858,12 @@ static bool getSymbolNamesFromObject(SymbolicFile &Obj,
<< toString(VersionsOrErr.takeError()) << "\n";
}
}

if (SyntheticSyms) {
if (!getPltSyms(Obj, SymbolList))
return false;
}

// If a "-s segname sectname" option was specified and this is a Mach-O
// file get the section number for that section in this object file.
unsigned int Nsect = 0;
Expand Down Expand Up @@ -2474,6 +2531,7 @@ int llvm_nm_main(int argc, char **argv, const llvm::ToolContext &) {
"(hexadecimal)");
SizeSort = Args.hasArg(OPT_size_sort);
SpecialSyms = Args.hasArg(OPT_special_syms);
SyntheticSyms = Args.hasArg(OPT_synthetic_syms);
UndefinedOnly = Args.hasArg(OPT_undefined_only);
WithoutAliases = Args.hasArg(OPT_without_aliases);

Expand Down