-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[llvm-nm] Introduce synthetic flag #138232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-binary-utilities Author: Vladislav Khmelevsky (yota9) ChangesCompatible with GNU nm --syntethic option is used to show special Full diff: https://github.com/llvm/llvm-project/pull/138232.diff 3 Files Affected:
diff --git a/llvm/test/tools/llvm-nm/AArch64/synthetic.test b/llvm/test/tools/llvm-nm/AArch64/synthetic.test
new file mode 100644
index 0000000000000..c98770db6ab8a
--- /dev/null
+++ b/llvm/test/tools/llvm-nm/AArch64/synthetic.test
@@ -0,0 +1,90 @@
+## Test --synthetic flag.
+
+# RUN: yaml2obj %s -o %t
+
+# RUN: llvm-nm %t | count 0
+# RUN: llvm-nm %t --synthetic | FileCheck %s
+
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_DYN
+ Machine: EM_AARCH64
+Sections:
+ - Name: .rela.plt
+ Type: SHT_RELA
+ Flags: [ SHF_ALLOC, SHF_INFO_LINK ]
+ Address: 0x540
+ Link: .dynsym
+ AddressAlign: 0x8
+ Info: .got
+ Relocations:
+ - Offset: 0x10FA8
+ Symbol: __libc_start_main
+ Type: R_AARCH64_JUMP_SLOT
+ - Offset: 0x10FB0
+ Symbol: __cxa_finalize
+ Type: R_AARCH64_JUMP_SLOT
+ - Offset: 0x10FB8
+ Symbol: __gmon_start__
+ Type: R_AARCH64_JUMP_SLOT
+ - Offset: 0x10FC0
+ Symbol: abort
+ Type: R_AARCH64_JUMP_SLOT
+ - Offset: 0x10FC8
+ Symbol: puts
+ Type: R_AARCH64_JUMP_SLOT
+ - Name: .plt
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ Address: 0x5D0
+ AddressAlign: 0x10
+ Content: F07BBFA99000009011D247F910823E9120021FD61F2003D51F2003D51F2003D59000009011D647F910A23E9120021FD69000009011DA47F910C23E9120021FD69000009011DE47F910E23E9120021FD69000009011E247F910023F9120021FD69000009011E647F910223F9120021FD6
+ - Name: .text
+ Type: SHT_PROGBITS
+ Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
+ Address: 0x640
+ - Name: .dynamic
+ Type: SHT_DYNAMIC
+ Flags: [ SHF_WRITE, SHF_ALLOC ]
+ Address: 0x10DA0
+ Link: .dynstr
+ AddressAlign: 0x8
+ Entries:
+ - Tag: DT_JMPREL
+ Value: 0x540
+ - Tag: DT_PLTRELSZ
+ Value: 0x78
+ - Tag: DT_PLTGOT
+ Value: 0x10F90
+ - Tag: DT_NULL
+ Value: 0x0
+ - Name: .got
+ Type: SHT_PROGBITS
+ Flags: [ SHF_WRITE, SHF_ALLOC ]
+ Address: 0x10F90
+ AddressAlign: 0x8
+ EntSize: 0x8
+ Content: 000000000000000000000000000000000000000000000000D005000000000000D005000000000000D005000000000000D005000000000000D005000000000000A00D01000000000000000000000000000000000000000000000000000000000054070000000000000000000000000000
+DynamicSymbols:
+ - Name: __libc_start_main
+ Type: STT_FUNC
+ Binding: STB_GLOBAL
+ - Name: __cxa_finalize
+ Type: STT_FUNC
+ Binding: STB_WEAK
+ - Name: __gmon_start__
+ Binding: STB_WEAK
+ - Name: abort
+ Type: STT_FUNC
+ Binding: STB_GLOBAL
+ - Name: puts
+ Type: STT_FUNC
+ Binding: STB_GLOBAL
+
+# CHECK: 0000000000000600 T __cxa_finalize@plt
+# CHECK-NEXT: 0000000000000610 T __gmon_start__@plt
+# CHECK-NEXT: 00000000000005f0 T __libc_start_main@plt
+# CHECK-NEXT: 0000000000000620 T abort@plt
+# CHECK-NEXT: 0000000000000630 T puts@plt
diff --git a/llvm/tools/llvm-nm/Opts.td b/llvm/tools/llvm-nm/Opts.td
index 04d9f5db5cf85..b534c3a26a8f8 100644
--- a/llvm/tools/llvm-nm/Opts.td
+++ b/llvm/tools/llvm-nm/Opts.td
@@ -35,6 +35,7 @@ defm radix : Eq<"radix", "Radix (o/d/x) for printing symbol Values">, MetaVarNam
def reverse_sort : FF<"reverse-sort", "Sort in reverse order">;
def size_sort : FF<"size-sort", "Sort symbols by size">;
def special_syms : FF<"special-syms", "Do not filter special symbols from the output">;
+def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;
def undefined_only : FF<"undefined-only", "Show only undefined symbols">;
def version : FF<"version", "Display the version">;
def without_aliases : FF<"without-aliases", "Exclude aliases from output">, Flags<[HelpHidden]>;
diff --git a/llvm/tools/llvm-nm/llvm-nm.cpp b/llvm/tools/llvm-nm/llvm-nm.cpp
index ff07fbbaa5351..16db73c47f027 100644
--- a/llvm/tools/llvm-nm/llvm-nm.cpp
+++ b/llvm/tools/llvm-nm/llvm-nm.cpp
@@ -23,6 +23,7 @@
#include "llvm/Demangle/Demangle.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/LLVMContext.h"
+#include "llvm/MC/TargetRegistry.h"
#include "llvm/Object/Archive.h"
#include "llvm/Object/COFF.h"
#include "llvm/Object/COFFImportFile.h"
@@ -110,6 +111,7 @@ static bool PrintSize;
static bool Quiet;
static bool ReverseSort;
static bool SpecialSyms;
+static bool SyntheticSyms;
static bool SizeSort;
static bool UndefinedOnly;
static bool WithoutAliases;
@@ -1783,6 +1785,55 @@ getDynamicSyms(SymbolicFile &Obj) {
return E->getDynamicSymbolIterators();
}
+// Returns false if there is error found or true otherwise.
+static bool getPltSyms(SymbolicFile &Obj, std::vector<NMSymbol> &SymbolList) {
+ const auto *ELFObj = dyn_cast<ELFObjectFileBase>(&Obj);
+ if (!ELFObj)
+ return true;
+
+ std::string Err;
+ Triple TT;
+ TT.setArch(ELFObj->getArch());
+ TT.setOS(ELFObj->getOS());
+ const Target *TheTarget = TargetRegistry::lookupTarget(TT, Err);
+ if (!TheTarget) {
+ error("unable to find target for " + Obj.getFileName() + ": " + Err);
+ return false;
+ }
+
+ MCSubtargetInfo *STI = TheTarget->createMCSubtargetInfo(
+ TT.getTriple(), ELFObj->tryGetCPUName().value_or("").str(), "");
+ if (!STI) {
+ error("unable to create subtarget info for " + Obj.getFileName() + ": " +
+ Err);
+ return false;
+ }
+
+ for (auto Plt : ELFObj->getPltEntries(*STI)) {
+ if (Plt.Symbol) {
+ SymbolRef Symbol(*Plt.Symbol, ELFObj);
+ if (Expected<StringRef> NameOrErr = Symbol.getName()) {
+ if (!NameOrErr->empty()) {
+ NMSymbol S = {};
+ S.Address = Plt.Address;
+ S.Name = NameOrErr->str() + "@plt";
+ S.TypeChar = 'T';
+ S.SectionName = Plt.Section;
+ SymbolList.push_back(S);
+ }
+ } else {
+ consumeError(NameOrErr.takeError());
+ }
+ } else {
+ WithColor::warning(errs(), ToolName)
+ << "PLT entry at 0x" + Twine::utohexstr(Plt.Address)
+ << " references an invalid symbol";
+ }
+ }
+
+ return true;
+}
+
// Returns false if there is error found or true otherwise.
static bool getSymbolNamesFromObject(SymbolicFile &Obj,
std::vector<NMSymbol> &SymbolList) {
@@ -1807,6 +1858,12 @@ static bool getSymbolNamesFromObject(SymbolicFile &Obj,
<< toString(VersionsOrErr.takeError()) << "\n";
}
}
+
+ if (SyntheticSyms) {
+ if (!getPltSyms(Obj, SymbolList))
+ return false;
+ }
+
// If a "-s segname sectname" option was specified and this is a Mach-O
// file get the section number for that section in this object file.
unsigned int Nsect = 0;
@@ -2474,6 +2531,7 @@ int llvm_nm_main(int argc, char **argv, const llvm::ToolContext &) {
"(hexadecimal)");
SizeSort = Args.hasArg(OPT_size_sort);
SpecialSyms = Args.hasArg(OPT_special_syms);
+ SyntheticSyms = Args.hasArg(OPT_synthetic_syms);
UndefinedOnly = Args.hasArg(OPT_undefined_only);
WithoutAliases = Args.hasArg(OPT_without_aliases);
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, it's likely to be a few weeks before I can take a look at this. In the meantime, please add the option to the command guide (llvm/docs/CommandGuide/llvm-nm.rst). Please also see if the pre-merge failures are anything to do with your change.
@jh7370 Hello! Sure, would add couple of lines to documentation. The build failures seems to be unrelated, all tests are passed, I would check out more precise a bit later. Thank you, waiting for your review! |
Compatible with GNU nm --syntethic option is used to show special symbols created by the linker. Current implementation is limited to show plt entries in the form of symbol@plt and plt entry address. Currently it would be used for BOLT testing purposes (llvm#135867) in order to eliminate external GNU nm dependency.
Type: STT_FUNC | ||
Binding: STB_GLOBAL | ||
|
||
# CHECK: 0000000000000600 T __cxa_finalize@plt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing two or three @plt
should be sufficient.
You can define a function in b.so and reference it from a.c using clang a.c b.so -nostdlib
to eliminate gmon_start/libc_start_main references
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, if it is a major thing I would do it, just didn't want to waste time on this since 5 seems to be not a big issue :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a major thing. If you find patches to llvm-readobj/llvm-objdump, you will find @jh7370 and my requests to make tests minimal/sufficient to make test updating easier.
You could employ llvm/utils/update_test_body.py
to make generating the test easier.
Thank you, beat me to it. Endorsing this, but deferring to binary utilities reviewers. |
.. option:: --synthetic | ||
Include synthetic symbols in the output. Synthetic symbols are not present in | ||
the original object file but are generated by the tool to represent relevant | ||
runtime or linking information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps
Include synthetic symbols in the output. Synthetic symbols are not present in the original object file. This currently includes symbols derived from PLT entries for ELF.
I find
"are generated by the tool to represent relevant runtime or linking information." very vague and not very useful. The PLT detail seems important.
Some testing configurations don't support `nm --synthetic` option (where nm is a symlink to llvm-nm), which prevents us from resolving `puts@plt` symbol address in profile generation inside the test. Disable the check until llvm-nm support is landed (#138232). Test Plan: bin/llvm-lit -a tools/bolt/test/X86/callcont-fallthru.s
…m#140026) PR llvm#139953 used `DONTRUN` to disable some run lines, but that didn't work. Now switching to `RUN-DISABLED` for disabling the tests until llvm-nm support is landed (llvm#138232).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally found some time to catch up on old reviews that have been sat idle.
Please could you add test cases that exercise the error paths. Also a test case that shows that when --synthetic is used on a non-ELF object, no error is produced.
Flags: [ SHF_ALLOC, SHF_EXECINSTR ] | ||
Address: 0x5D0 | ||
AddressAlign: 0x10 | ||
Content: F07BBFA99000009011D247F910823E9120021FD61F2003D51F2003D51F2003D59000009011D647F910A23E9120021FD69000009011DA47F910C23E9120021FD69000009011DE47F910E23E9120021FD69000009011E247F910023F9120021FD69000009011E647F910223F9120021FD6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment illustrating why this and the .got have the contents they do could be useful for helping future test readers/maintainers.
@@ -35,6 +35,7 @@ defm radix : Eq<"radix", "Radix (o/d/x) for printing symbol Values">, MetaVarNam | |||
def reverse_sort : FF<"reverse-sort", "Sort in reverse order">; | |||
def size_sort : FF<"size-sort", "Sort symbols by size">; | |||
def special_syms : FF<"special-syms", "Do not filter special symbols from the output">; | |||
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">; | |
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">; |
Since this is rendered as monospace font on the command-line, a double space here will look ugly. Also, please reflow this option so it spreads over multiple lines, since it is rather long.
@@ -1783,6 +1785,55 @@ getDynamicSyms(SymbolicFile &Obj) { | |||
return E->getDynamicSymbolIterators(); | |||
} | |||
|
|||
// Returns false if there is error found or true otherwise. | |||
static bool getPltSyms(SymbolicFile &Obj, std::vector<NMSymbol> &SymbolList) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better for this function to return Error
and for the caller to report the error. Perhaps even try Expected<std::vector<NMSymbol>>
rather than using a parameter to record the results in, but that may not be a good idea, for performance reasons, depending on the size of the vector you'd often expect.
TT.setOS(ELFObj->getOS()); | ||
const Target *TheTarget = TargetRegistry::lookupTarget(TT, Err); | ||
if (!TheTarget) { | ||
error("unable to find target for " + Obj.getFileName() + ": " + Err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you give an example of what the full error message looks like, if e.g. you use llvm-nm built without X86_64 support with one that does support that target, please? I'd like to be sure the message isn't too clunky.
return false; | ||
} | ||
|
||
for (auto Plt : ELFObj->getPltEntries(*STI)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM style guide says to only use auto
where it improves readability. As things stand, I can't tell what the type of Plt
is here, which means it's harming readability here and should be expanded. Also, using auto
here doesn't convey whether this is a reference or not, when it probably should be const &
.
SymbolList.push_back(S); | ||
} | ||
} else { | ||
consumeError(NameOrErr.takeError()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this isn't a warning?
if (Plt.Symbol) { | ||
SymbolRef Symbol(*Plt.Symbol, ELFObj); | ||
if (Expected<StringRef> NameOrErr = Symbol.getName()) { | ||
if (!NameOrErr->empty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a little unfortunate that we're just ignoring such symbols. Perhaps it should be a warning too, or you could actually use it as an empty string (so you'd get @plt
simply as the name). The latter feels most appropriate to me without thinking too hard about it at the moment.
Compatible with GNU nm --syntethic option is used to show special
symbols created by the linker. Current implementation is limited to show
plt entries in the form of symbol@plt and plt entry address. Currently
it would be used for BOLT testing purposes
(#135867)
in order to eliminate external GNU nm dependency.