Skip to content

[llvm-nm] Introduce synthetic flag #138232

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

[llvm-nm] Introduce synthetic flag #138232

wants to merge 1 commit into from

Conversation

yota9
Copy link
Member

@yota9 yota9 commented May 2, 2025

Compatible with GNU nm --syntethic option is used to show special
symbols created by the linker. Current implementation is limited to show
plt entries in the form of symbol@plt and plt entry address. Currently
it would be used for BOLT testing purposes
(#135867)
in order to eliminate external GNU nm dependency.

@llvmbot
Copy link
Member

llvmbot commented May 2, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Vladislav Khmelevsky (yota9)

Changes

Compatible with GNU nm --syntethic option is used to show special
symbols created by the linker. Current implementation is limited to show
plt entries in the form of symbol@plt and plt entry address. Currently
it would be used for BOLT testing purposes
(#135867)
in order to eliminate external GNU nm dependency.


Full diff: https://github.com/llvm/llvm-project/pull/138232.diff

3 Files Affected:

  • (added) llvm/test/tools/llvm-nm/AArch64/synthetic.test (+90)
  • (modified) llvm/tools/llvm-nm/Opts.td (+1)
  • (modified) llvm/tools/llvm-nm/llvm-nm.cpp (+58)
diff --git a/llvm/test/tools/llvm-nm/AArch64/synthetic.test b/llvm/test/tools/llvm-nm/AArch64/synthetic.test
new file mode 100644
index 0000000000000..c98770db6ab8a
--- /dev/null
+++ b/llvm/test/tools/llvm-nm/AArch64/synthetic.test
@@ -0,0 +1,90 @@
+## Test --synthetic flag.
+
+# RUN: yaml2obj %s -o %t
+
+# RUN: llvm-nm %t | count 0
+# RUN: llvm-nm %t --synthetic | FileCheck %s
+
+--- !ELF
+FileHeader:
+  Class:           ELFCLASS64
+  Data:            ELFDATA2LSB
+  Type:            ET_DYN
+  Machine:         EM_AARCH64
+Sections:
+  - Name:            .rela.plt
+    Type:            SHT_RELA
+    Flags:           [ SHF_ALLOC, SHF_INFO_LINK ]
+    Address:         0x540
+    Link:            .dynsym
+    AddressAlign:    0x8
+    Info:            .got
+    Relocations:
+      - Offset:          0x10FA8
+        Symbol:          __libc_start_main
+        Type:            R_AARCH64_JUMP_SLOT
+      - Offset:          0x10FB0
+        Symbol:          __cxa_finalize
+        Type:            R_AARCH64_JUMP_SLOT
+      - Offset:          0x10FB8
+        Symbol:          __gmon_start__
+        Type:            R_AARCH64_JUMP_SLOT
+      - Offset:          0x10FC0
+        Symbol:          abort
+        Type:            R_AARCH64_JUMP_SLOT
+      - Offset:          0x10FC8
+        Symbol:          puts
+        Type:            R_AARCH64_JUMP_SLOT
+  - Name:            .plt
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x5D0
+    AddressAlign:    0x10
+    Content:         F07BBFA99000009011D247F910823E9120021FD61F2003D51F2003D51F2003D59000009011D647F910A23E9120021FD69000009011DA47F910C23E9120021FD69000009011DE47F910E23E9120021FD69000009011E247F910023F9120021FD69000009011E647F910223F9120021FD6
+  - Name:            .text
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_ALLOC, SHF_EXECINSTR ]
+    Address:         0x640
+  - Name:            .dynamic
+    Type:            SHT_DYNAMIC
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x10DA0
+    Link:            .dynstr
+    AddressAlign:    0x8
+    Entries:
+      - Tag:             DT_JMPREL
+        Value:           0x540
+      - Tag:             DT_PLTRELSZ
+        Value:           0x78
+      - Tag:             DT_PLTGOT
+        Value:           0x10F90
+      - Tag:             DT_NULL
+        Value:           0x0
+  - Name:            .got
+    Type:            SHT_PROGBITS
+    Flags:           [ SHF_WRITE, SHF_ALLOC ]
+    Address:         0x10F90
+    AddressAlign:    0x8
+    EntSize:         0x8
+    Content:         000000000000000000000000000000000000000000000000D005000000000000D005000000000000D005000000000000D005000000000000D005000000000000A00D01000000000000000000000000000000000000000000000000000000000054070000000000000000000000000000
+DynamicSymbols:
+  - Name:            __libc_start_main
+    Type:            STT_FUNC
+    Binding:         STB_GLOBAL
+  - Name:            __cxa_finalize
+    Type:            STT_FUNC
+    Binding:         STB_WEAK
+  - Name:            __gmon_start__
+    Binding:         STB_WEAK
+  - Name:            abort
+    Type:            STT_FUNC
+    Binding:         STB_GLOBAL
+  - Name:            puts
+    Type:            STT_FUNC
+    Binding:         STB_GLOBAL
+
+# CHECK:      0000000000000600 T __cxa_finalize@plt
+# CHECK-NEXT: 0000000000000610 T __gmon_start__@plt
+# CHECK-NEXT: 00000000000005f0 T __libc_start_main@plt
+# CHECK-NEXT: 0000000000000620 T abort@plt
+# CHECK-NEXT: 0000000000000630 T puts@plt
diff --git a/llvm/tools/llvm-nm/Opts.td b/llvm/tools/llvm-nm/Opts.td
index 04d9f5db5cf85..b534c3a26a8f8 100644
--- a/llvm/tools/llvm-nm/Opts.td
+++ b/llvm/tools/llvm-nm/Opts.td
@@ -35,6 +35,7 @@ defm radix : Eq<"radix", "Radix (o/d/x) for printing symbol Values">, MetaVarNam
 def reverse_sort : FF<"reverse-sort", "Sort in reverse order">;
 def size_sort : FF<"size-sort", "Sort symbols by size">;
 def special_syms : FF<"special-syms", "Do not filter special symbols from the output">;
+def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output.  These are special symbols created by the linker for various purposes">;
 def undefined_only : FF<"undefined-only", "Show only undefined symbols">;
 def version : FF<"version", "Display the version">;
 def without_aliases : FF<"without-aliases", "Exclude aliases from output">, Flags<[HelpHidden]>;
diff --git a/llvm/tools/llvm-nm/llvm-nm.cpp b/llvm/tools/llvm-nm/llvm-nm.cpp
index ff07fbbaa5351..16db73c47f027 100644
--- a/llvm/tools/llvm-nm/llvm-nm.cpp
+++ b/llvm/tools/llvm-nm/llvm-nm.cpp
@@ -23,6 +23,7 @@
 #include "llvm/Demangle/Demangle.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/LLVMContext.h"
+#include "llvm/MC/TargetRegistry.h"
 #include "llvm/Object/Archive.h"
 #include "llvm/Object/COFF.h"
 #include "llvm/Object/COFFImportFile.h"
@@ -110,6 +111,7 @@ static bool PrintSize;
 static bool Quiet;
 static bool ReverseSort;
 static bool SpecialSyms;
+static bool SyntheticSyms;
 static bool SizeSort;
 static bool UndefinedOnly;
 static bool WithoutAliases;
@@ -1783,6 +1785,55 @@ getDynamicSyms(SymbolicFile &Obj) {
   return E->getDynamicSymbolIterators();
 }
 
+// Returns false if there is error found or true otherwise.
+static bool getPltSyms(SymbolicFile &Obj, std::vector<NMSymbol> &SymbolList) {
+  const auto *ELFObj = dyn_cast<ELFObjectFileBase>(&Obj);
+  if (!ELFObj)
+    return true;
+
+  std::string Err;
+  Triple TT;
+  TT.setArch(ELFObj->getArch());
+  TT.setOS(ELFObj->getOS());
+  const Target *TheTarget = TargetRegistry::lookupTarget(TT, Err);
+  if (!TheTarget) {
+    error("unable to find target for " + Obj.getFileName() + ": " + Err);
+    return false;
+  }
+
+  MCSubtargetInfo *STI = TheTarget->createMCSubtargetInfo(
+      TT.getTriple(), ELFObj->tryGetCPUName().value_or("").str(), "");
+  if (!STI) {
+    error("unable to create subtarget info for " + Obj.getFileName() + ": " +
+          Err);
+    return false;
+  }
+
+  for (auto Plt : ELFObj->getPltEntries(*STI)) {
+    if (Plt.Symbol) {
+      SymbolRef Symbol(*Plt.Symbol, ELFObj);
+      if (Expected<StringRef> NameOrErr = Symbol.getName()) {
+        if (!NameOrErr->empty()) {
+          NMSymbol S = {};
+          S.Address = Plt.Address;
+          S.Name = NameOrErr->str() + "@plt";
+          S.TypeChar = 'T';
+          S.SectionName = Plt.Section;
+          SymbolList.push_back(S);
+        }
+      } else {
+        consumeError(NameOrErr.takeError());
+      }
+    } else {
+      WithColor::warning(errs(), ToolName)
+          << "PLT entry at 0x" + Twine::utohexstr(Plt.Address)
+          << " references an invalid symbol";
+    }
+  }
+
+  return true;
+}
+
 // Returns false if there is error found or true otherwise.
 static bool getSymbolNamesFromObject(SymbolicFile &Obj,
                                      std::vector<NMSymbol> &SymbolList) {
@@ -1807,6 +1858,12 @@ static bool getSymbolNamesFromObject(SymbolicFile &Obj,
             << toString(VersionsOrErr.takeError()) << "\n";
     }
   }
+
+  if (SyntheticSyms) {
+    if (!getPltSyms(Obj, SymbolList))
+      return false;
+  }
+
   // If a "-s segname sectname" option was specified and this is a Mach-O
   // file get the section number for that section in this object file.
   unsigned int Nsect = 0;
@@ -2474,6 +2531,7 @@ int llvm_nm_main(int argc, char **argv, const llvm::ToolContext &) {
           "(hexadecimal)");
   SizeSort = Args.hasArg(OPT_size_sort);
   SpecialSyms = Args.hasArg(OPT_special_syms);
+  SyntheticSyms = Args.hasArg(OPT_synthetic_syms);
   UndefinedOnly = Args.hasArg(OPT_undefined_only);
   WithoutAliases = Args.hasArg(OPT_without_aliases);
 

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, it's likely to be a few weeks before I can take a look at this. In the meantime, please add the option to the command guide (llvm/docs/CommandGuide/llvm-nm.rst). Please also see if the pre-merge failures are anything to do with your change.

@yota9
Copy link
Member Author

yota9 commented May 2, 2025

@jh7370 Hello! Sure, would add couple of lines to documentation. The build failures seems to be unrelated, all tests are passed, I would check out more precise a bit later. Thank you, waiting for your review!
UPD the fail is related to flang, so not related.

Compatible with GNU nm --syntethic option is used to show special
symbols created by the linker. Current implementation is limited to show
plt entries in the form of symbol@plt and plt entry address. Currently
it would be used for BOLT testing purposes
(llvm#135867)
in order to eliminate external GNU nm dependency.
Type: STT_FUNC
Binding: STB_GLOBAL

# CHECK: 0000000000000600 T __cxa_finalize@plt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing two or three @plt should be sufficient.
You can define a function in b.so and reference it from a.c using clang a.c b.so -nostdlib to eliminate gmon_start/libc_start_main references

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, if it is a major thing I would do it, just didn't want to waste time on this since 5 seems to be not a big issue :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a major thing. If you find patches to llvm-readobj/llvm-objdump, you will find @jh7370 and my requests to make tests minimal/sufficient to make test updating easier.

You could employ llvm/utils/update_test_body.py to make generating the test easier.

@aaupov
Copy link
Contributor

aaupov commented May 2, 2025

Thank you, beat me to it. Endorsing this, but deferring to binary utilities reviewers.

.. option:: --synthetic
Include synthetic symbols in the output. Synthetic symbols are not present in
the original object file but are generated by the tool to represent relevant
runtime or linking information.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps

Include synthetic symbols in the output. Synthetic symbols are not present in the original object file. This currently includes symbols derived from PLT entries for ELF.

I find
"are generated by the tool to represent relevant runtime or linking information." very vague and not very useful. The PLT detail seems important.

aaupov added a commit that referenced this pull request May 14, 2025
Some testing configurations don't support `nm --synthetic` option (where
nm is a symlink to llvm-nm), which prevents us from resolving `puts@plt`
symbol address in profile generation inside the test.
Disable the check until llvm-nm support is landed (#138232).

Test Plan: bin/llvm-lit -a tools/bolt/test/X86/callcont-fallthru.s
paschalis-mpeis added a commit that referenced this pull request May 15, 2025
PR #139953 used `DONTRUN` to disable some run lines, but that
didn't work. Now switching to `RUN-DISABLED` for disabling the
tests until llvm-nm support is  landed (#138232).
paschalis-mpeis added a commit that referenced this pull request May 15, 2025
…0026)

PR #139953 used `DONTRUN` to disable some run lines, but that didn't
work. Now switching to `RUN-DISABLED` for disabling the tests until
llvm-nm support is landed (#138232).
TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request May 19, 2025
…m#140026)

PR llvm#139953 used `DONTRUN` to disable some run lines, but that didn't
work. Now switching to `RUN-DISABLED` for disabling the tests until
llvm-nm support is landed (llvm#138232).
Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally found some time to catch up on old reviews that have been sat idle.

Please could you add test cases that exercise the error paths. Also a test case that shows that when --synthetic is used on a non-ELF object, no error is produced.

Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
Address: 0x5D0
AddressAlign: 0x10
Content: F07BBFA99000009011D247F910823E9120021FD61F2003D51F2003D51F2003D59000009011D647F910A23E9120021FD69000009011DA47F910C23E9120021FD69000009011DE47F910E23E9120021FD69000009011E247F910023F9120021FD69000009011E647F910223F9120021FD6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment illustrating why this and the .got have the contents they do could be useful for helping future test readers/maintainers.

@@ -35,6 +35,7 @@ defm radix : Eq<"radix", "Radix (o/d/x) for printing symbol Values">, MetaVarNam
def reverse_sort : FF<"reverse-sort", "Sort in reverse order">;
def size_sort : FF<"size-sort", "Sort symbols by size">;
def special_syms : FF<"special-syms", "Do not filter special symbols from the output">;
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;
def synthetic_syms : FF<"synthetic", "Include synthetic symbols in the output. These are special symbols created by the linker for various purposes">;

Since this is rendered as monospace font on the command-line, a double space here will look ugly. Also, please reflow this option so it spreads over multiple lines, since it is rather long.

@@ -1783,6 +1785,55 @@ getDynamicSyms(SymbolicFile &Obj) {
return E->getDynamicSymbolIterators();
}

// Returns false if there is error found or true otherwise.
static bool getPltSyms(SymbolicFile &Obj, std::vector<NMSymbol> &SymbolList) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better for this function to return Error and for the caller to report the error. Perhaps even try Expected<std::vector<NMSymbol>> rather than using a parameter to record the results in, but that may not be a good idea, for performance reasons, depending on the size of the vector you'd often expect.

TT.setOS(ELFObj->getOS());
const Target *TheTarget = TargetRegistry::lookupTarget(TT, Err);
if (!TheTarget) {
error("unable to find target for " + Obj.getFileName() + ": " + Err);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give an example of what the full error message looks like, if e.g. you use llvm-nm built without X86_64 support with one that does support that target, please? I'd like to be sure the message isn't too clunky.

return false;
}

for (auto Plt : ELFObj->getPltEntries(*STI)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style guide says to only use auto where it improves readability. As things stand, I can't tell what the type of Plt is here, which means it's harming readability here and should be expanded. Also, using auto here doesn't convey whether this is a reference or not, when it probably should be const &.

SymbolList.push_back(S);
}
} else {
consumeError(NameOrErr.takeError());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this isn't a warning?

if (Plt.Symbol) {
SymbolRef Symbol(*Plt.Symbol, ELFObj);
if (Expected<StringRef> NameOrErr = Symbol.getName()) {
if (!NameOrErr->empty()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a little unfortunate that we're just ignoring such symbols. Perhaps it should be a warning too, or you could actually use it as an empty string (so you'd get @plt simply as the name). The latter feels most appropriate to me without thinking too hard about it at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants