[lldb][Mach-O] Don't read symbol table of specially marked binary #129967

jasonmolenda
Collaborator

We have a binary image on Darwin that has no code, only metadata. It has a large symbol table with many external symbol names that will not be needed in the debugger. The binary may also be absent from the debugger system, in which case lldb must read all of the symbol names out of memory, one at a time, which can be quite slow.

We're adding a section, __TEXT,__lldb_no_nlist, to this binary to indicate that lldb should not read the nlist symbols for it when we are reading out of memory. If lldb is run with an on-disk version of the binary, we will load the symbol table as we normally would; there's no benefit to handling this binary differently in that case.
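
As a rough illustration of what "the binary carries a `__TEXT,__lldb_no_nlist` section" means at the file-format level, here is a minimal Python sketch that walks the load commands of a Mach-O image and looks for that section. This is not lldb's code (lldb does the lookup in C++ through its section list); the function name `has_no_nlist_marker` is hypothetical, and the sketch assumes a thin, little-endian, 64-bit image.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF      # 64-bit Mach-O magic, read as little-endian
LC_SEGMENT_64 = 0x19          # load command describing a 64-bit segment

HEADER_FMT = "<8I"            # mach_header_64 (32 bytes)
SEGMENT_FMT = "<2I16s4Q4I"    # segment_command_64 (72 bytes)
SECTION_FMT = "<16s16s2Q8I"   # section_64 (80 bytes)


def _name(raw):
    """Decode a fixed-width, NUL-padded Mach-O name field."""
    return raw.split(b"\0", 1)[0].decode("ascii")


def has_no_nlist_marker(image):
    """Return True if the image has a __TEXT,__lldb_no_nlist section.

    Hypothetical helper for illustration only; `image` holds the raw
    bytes of a thin little-endian 64-bit Mach-O file.
    """
    header = struct.unpack_from(HEADER_FMT, image, 0)
    magic, ncmds = header[0], header[4]
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O image")

    offset = struct.calcsize(HEADER_FMT)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", image, offset)
        if cmd == LC_SEGMENT_64:
            fields = struct.unpack_from(SEGMENT_FMT, image, offset)
            segname, nsects = _name(fields[2]), fields[9]
            # The section_64 records follow the segment command directly.
            sect_off = offset + struct.calcsize(SEGMENT_FMT)
            for _ in range(nsects):
                raw = struct.unpack_from("<16s", image, sect_off)[0]
                if segname == "__TEXT" and _name(raw) == "__lldb_no_nlist":
                    return True
                sect_off += struct.calcsize(SECTION_FMT)
        offset += cmdsize
    return False
```

On a Darwin host this could be fed the bytes of a dylib built with the marker section; a universal (fat) binary would be rejected with `ValueError`, since the sketch does not handle fat headers.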

I added a test where I create a dylib with this specially named section and launch the process. The main binary deletes the dylib from the disk so lldb is forced to read it out of memory. lldb attaches to the process and confirms that the dylib is present and is a memory Module. If the dylib is not present, or lldb found the on-disk copy because it hasn't been deleted yet, we delete the target, flush the Debugger's module cache, sleep and retry, up to ten times. I create the specially named section by compiling an assembly file that puts a byte in the section, which makes for a bit of a messy Makefile (the pre-canned actions to build a dylib don't quite handle this case), but I don't think it's much of a problem. This is a purely skipUnlessDarwin test case.

rdar://146167816

@jasonmolenda jasonmolenda requested a review from jimingham March 6, 2025 01:35
@llvmbot llvmbot added the lldb label Mar 6, 2025
@llvmbot
Member

llvmbot commented Mar 6, 2025

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Full diff: https://github.com/llvm/llvm-project/pull/129967.diff

7 Files Affected:

  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (+44-24)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h (+1)
  • (added) lldb/test/API/macosx/no-nlist-memory-module/Makefile (+15)
  • (added) lldb/test/API/macosx/no-nlist-memory-module/TestNoNlistsDylib.py (+70)
  • (added) lldb/test/API/macosx/no-nlist-memory-module/main.c (+37)
  • (added) lldb/test/API/macosx/no-nlist-memory-module/no-nlist-sect.s (+3)
  • (added) lldb/test/API/macosx/no-nlist-memory-module/no-nlists.c (+3)
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index a19322ff1e263..f31b56b9f81e6 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -918,6 +918,11 @@ ConstString ObjectFileMachO::GetSectionNameEHFrame() {
   return g_section_name_eh_frame;
 }
 
+ConstString ObjectFileMachO::GetSectionNameLLDBNoNlist() {
+  static ConstString g_section_name_lldb_no_nlist("__lldb_no_nlist");
+  return g_section_name_lldb_no_nlist;
+}
+
 bool ObjectFileMachO::MagicBytesMatch(DataBufferSP data_sp,
                                       lldb::addr_t data_offset,
                                       lldb::addr_t data_length) {
@@ -2394,8 +2399,39 @@ void ObjectFileMachO::ParseSymtab(Symtab &symtab) {
   uint32_t memory_module_load_level = eMemoryModuleLoadLevelComplete;
   bool is_shared_cache_image = IsSharedCacheBinary();
   bool is_local_shared_cache_image = is_shared_cache_image && !IsInMemory();
+
+  ConstString g_segment_name_TEXT = GetSegmentNameTEXT();
+  ConstString g_segment_name_DATA = GetSegmentNameDATA();
+  ConstString g_segment_name_DATA_DIRTY = GetSegmentNameDATA_DIRTY();
+  ConstString g_segment_name_DATA_CONST = GetSegmentNameDATA_CONST();
+  ConstString g_segment_name_OBJC = GetSegmentNameOBJC();
+  ConstString g_section_name_eh_frame = GetSectionNameEHFrame();
+  ConstString g_section_name_lldb_no_nlist = GetSectionNameLLDBNoNlist();
+  SectionSP text_section_sp(
+      section_list->FindSectionByName(g_segment_name_TEXT));
+  SectionSP data_section_sp(
+      section_list->FindSectionByName(g_segment_name_DATA));
   SectionSP linkedit_section_sp(
       section_list->FindSectionByName(GetSegmentNameLINKEDIT()));
+  SectionSP data_dirty_section_sp(
+      section_list->FindSectionByName(g_segment_name_DATA_DIRTY));
+  SectionSP data_const_section_sp(
+      section_list->FindSectionByName(g_segment_name_DATA_CONST));
+  SectionSP objc_section_sp(
+      section_list->FindSectionByName(g_segment_name_OBJC));
+  SectionSP eh_frame_section_sp;
+  SectionSP lldb_no_nlist_section_sp;
+  if (text_section_sp.get()) {
+    eh_frame_section_sp = text_section_sp->GetChildren().FindSectionByName(
+        g_section_name_eh_frame);
+    lldb_no_nlist_section_sp = text_section_sp->GetChildren().FindSectionByName(
+        g_section_name_lldb_no_nlist);
+  } else {
+    eh_frame_section_sp =
+        section_list->FindSectionByName(g_section_name_eh_frame);
+    lldb_no_nlist_section_sp =
+        section_list->FindSectionByName(g_section_name_lldb_no_nlist);
+  }
 
   if (process && m_header.filetype != llvm::MachO::MH_OBJECT &&
       !is_local_shared_cache_image) {
@@ -2403,6 +2439,14 @@ void ObjectFileMachO::ParseSymtab(Symtab &symtab) {
 
     memory_module_load_level = target.GetMemoryModuleLoadLevel();
 
+    // If __TEXT,__lldb_no_nlist section is present in this binary,
+    // and we're reading it out of memory, do not read any of the
+    // nlist entries.  They are not needed in lldb and it may be
+    // expensive to load these.  This is to handle a dylib consisting
+    // of only metadata, no code, but it has many nlist entries.
+    if (lldb_no_nlist_section_sp)
+      memory_module_load_level = eMemoryModuleLoadLevelMinimal;
+
     // Reading mach file from memory in a process or core file...
 
     if (linkedit_section_sp) {
@@ -2526,30 +2570,6 @@ void ObjectFileMachO::ParseSymtab(Symtab &symtab) {
 
   const bool have_strtab_data = strtab_data.GetByteSize() > 0;
 
-  ConstString g_segment_name_TEXT = GetSegmentNameTEXT();
-  ConstString g_segment_name_DATA = GetSegmentNameDATA();
-  ConstString g_segment_name_DATA_DIRTY = GetSegmentNameDATA_DIRTY();
-  ConstString g_segment_name_DATA_CONST = GetSegmentNameDATA_CONST();
-  ConstString g_segment_name_OBJC = GetSegmentNameOBJC();
-  ConstString g_section_name_eh_frame = GetSectionNameEHFrame();
-  SectionSP text_section_sp(
-      section_list->FindSectionByName(g_segment_name_TEXT));
-  SectionSP data_section_sp(
-      section_list->FindSectionByName(g_segment_name_DATA));
-  SectionSP data_dirty_section_sp(
-      section_list->FindSectionByName(g_segment_name_DATA_DIRTY));
-  SectionSP data_const_section_sp(
-      section_list->FindSectionByName(g_segment_name_DATA_CONST));
-  SectionSP objc_section_sp(
-      section_list->FindSectionByName(g_segment_name_OBJC));
-  SectionSP eh_frame_section_sp;
-  if (text_section_sp.get())
-    eh_frame_section_sp = text_section_sp->GetChildren().FindSectionByName(
-        g_section_name_eh_frame);
-  else
-    eh_frame_section_sp =
-        section_list->FindSectionByName(g_section_name_eh_frame);
-
   const bool is_arm = (m_header.cputype == llvm::MachO::CPU_TYPE_ARM);
   const bool always_thumb = GetArchitecture().IsAlwaysThumbInstructions();
 
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
index 27b2078b5a3fc..7f67f5e04f1d6 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.h
@@ -286,6 +286,7 @@ class ObjectFileMachO : public lldb_private::ObjectFile {
   static lldb_private::ConstString GetSegmentNameDWARF();
   static lldb_private::ConstString GetSegmentNameLLVM_COV();
   static lldb_private::ConstString GetSectionNameEHFrame();
+  static lldb_private::ConstString GetSectionNameLLDBNoNlist();
 
   llvm::MachO::dysymtab_command m_dysymtab;
   std::vector<llvm::MachO::section_64> m_mach_sections;
diff --git a/lldb/test/API/macosx/no-nlist-memory-module/Makefile b/lldb/test/API/macosx/no-nlist-memory-module/Makefile
new file mode 100644
index 0000000000000..cb4f9968112cd
--- /dev/null
+++ b/lldb/test/API/macosx/no-nlist-memory-module/Makefile
@@ -0,0 +1,15 @@
+C_SOURCES := main.c
+LD_EXTRAS = -Wl,-rpath "-Wl,$(shell pwd)" -L. -lno-nlists
+
+.PHONY: build-libno-nlists
+all: build-libno-nlists a.out
+
+include Makefile.rules
+
+build-libno-nlists: no-nlists.c no-nlist-sect.s
+	$(CC) $(CFLAGS) -c -o no-nlists.o $(<D)/no-nlists.c
+	$(CC) $(CFLAGS) -c -o no-nlist-sect.o $(<D)/no-nlist-sect.s
+	$(LD) -dynamiclib -o libno-nlists.dylib no-nlists.o no-nlist-sect.o -install_name "@executable_path/libno-nlists.dylib"
+
+clean::
+	rm -rf no-nlists.o no-nlist-sect.o main.o main.s a.out a.out.dSYM libno-nlists.dylib libno-nlists.dylib.dSYM
diff --git a/lldb/test/API/macosx/no-nlist-memory-module/TestNoNlistsDylib.py b/lldb/test/API/macosx/no-nlist-memory-module/TestNoNlistsDylib.py
new file mode 100644
index 0000000000000..73644a5f00531
--- /dev/null
+++ b/lldb/test/API/macosx/no-nlist-memory-module/TestNoNlistsDylib.py
@@ -0,0 +1,70 @@
+"""
+Test that we don't read the nlist symbols for a specially marked dylib
+when read from memory.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+import time
+
+
+class NoNlistsTestCase(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    @skipIfRemote
+    @skipUnlessDarwin
+    def test_no_nlist_symbols(self):
+        self.build()
+
+        exe = os.path.realpath(self.getBuildArtifact("a.out"))
+
+        popen = self.spawnSubprocess(exe)
+        pid = popen.pid
+
+        self.dbg.SetAsync(False)
+
+        m = lldb.SBModule()
+        target = lldb.SBTarget()
+        process = lldb.SBProcess()
+        reattach_count = 0
+
+        # Attach to the process, see if we have a memory module
+        # for libno-nlists.dylib.  If not, detach, delete the
+        # Target, and flush the orphaned modules from the Debugger
+        # so we don't hold on to a reference of the on-disk binary.
+
+        # If we haven't succeeded after ten attempts of attaching and
+        # detaching, fail the test.
+        while not m.IsValid() or m.IsFileBacked():
+            if process.IsValid():
+                process.Detach()
+                self.dbg.DeleteTarget(target)
+                self.dbg.MemoryPressureDetected()
+                time.sleep(2)
+
+            self.runCmd("process attach -p " + str(pid))
+            target = self.dbg.GetSelectedTarget()
+            process = target.GetProcess()
+            m = target.FindModule(lldb.SBFileSpec("libno-nlists.dylib"))
+
+            reattach_count = reattach_count + 1
+            if reattach_count > 10:
+                break
+
+        self.assertTrue(process, PROCESS_IS_VALID)
+
+        # Test that we found libno-nlists.dylib, it is a memory
+        # module, and that it has no symbols.
+        self.assertTrue(m.IsValid())
+        self.assertFalse(m.IsFileBacked())
+        self.assertEqual(m.GetNumSymbols(), 0)
+
+        # And as a sanity check, get the main binary's module,
+        # test that it is file backed and that it has more than
+        # zero symbols.
+        m = target.FindModule(lldb.SBFileSpec("a.out"))
+        self.assertTrue(m.IsValid())
+        self.assertTrue(m.IsFileBacked())
+        self.assertGreater(m.GetNumSymbols(), 0)
diff --git a/lldb/test/API/macosx/no-nlist-memory-module/main.c b/lldb/test/API/macosx/no-nlist-memory-module/main.c
new file mode 100644
index 0000000000000..fb1451ccc99d1
--- /dev/null
+++ b/lldb/test/API/macosx/no-nlist-memory-module/main.c
@@ -0,0 +1,37 @@
+#include <libgen.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/param.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+int get_return_value();
+
+int main(int argc, char **argv) {
+
+  // Remove libno-nlists.dylib that we are linked against.
+  char executable_path[PATH_MAX];
+  realpath(argv[0], executable_path);
+  executable_path[PATH_MAX - 1] = '\0';
+
+  char *dir = dirname(executable_path);
+  char dylib_path[PATH_MAX];
+  snprintf(dylib_path, PATH_MAX, "%s/%s", dir, "libno-nlists.dylib");
+  dylib_path[PATH_MAX - 1] = '\0';
+  struct stat sb;
+  if (stat(dylib_path, &sb) == -1) {
+    printf("Could not find dylib %s to remove it\n", dylib_path);
+    exit(1);
+  }
+  if (unlink(dylib_path) == -1) {
+    printf("Could not remove dylib %s\n", dylib_path);
+    exit(2);
+  }
+
+  // This sleep will exit as soon as lldb attaches
+  // and interrupts it.
+  sleep(200);
+
+  int retval = get_return_value();
+  return retval;
+}
diff --git a/lldb/test/API/macosx/no-nlist-memory-module/no-nlist-sect.s b/lldb/test/API/macosx/no-nlist-memory-module/no-nlist-sect.s
new file mode 100644
index 0000000000000..0a7c974f9362c
--- /dev/null
+++ b/lldb/test/API/macosx/no-nlist-memory-module/no-nlist-sect.s
@@ -0,0 +1,3 @@
+   .section __TEXT,__lldb_no_nlist,regular,pure_instructions
+   .p2align 2
+   .byte 0x10
diff --git a/lldb/test/API/macosx/no-nlist-memory-module/no-nlists.c b/lldb/test/API/macosx/no-nlist-memory-module/no-nlists.c
new file mode 100644
index 0000000000000..e75c5a5274bc0
--- /dev/null
+++ b/lldb/test/API/macosx/no-nlist-memory-module/no-nlists.c
@@ -0,0 +1,3 @@
+int get_return_value() {
+  return 10;
+}

@jasonmolenda
Collaborator Author

The change to ObjectFileMachO looks a little larger than it really is because I moved the SectionSP initializations ~100 lines earlier in ParseSymtab than they were. I'm scanning for this new section the same way we scan for eh_frame. ObjectFileMachO already had "load level" overrides for reading ObjectFiles out of memory, so it was a one-line change to use that mechanism for this specially marked binary once we'd detected the section.
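
The override described above can be modeled in a few lines. The following Python sketch is only a simplified restatement of the control flow visible in the diff, not lldb code: the enum mirrors lldb's `eMemoryModuleLoadLevelMinimal` / `Partial` / `Complete` names (the numeric values here are illustrative), and `effective_load_level` is a hypothetical helper.

```python
from enum import IntEnum


class MemoryModuleLoadLevel(IntEnum):
    # Names follow lldb's eMemoryModuleLoadLevel* constants; the numeric
    # values are an assumption made for this sketch.
    MINIMAL = 0
    PARTIAL = 1
    COMPLETE = 2


def effective_load_level(target_setting, reading_from_memory,
                         has_no_nlist_section):
    """Model how much of the symbol table ParseSymtab would load.

    Hypothetical helper: on-disk binaries are always parsed completely;
    memory modules use the target's configured level, unless the marker
    section forces the minimal level.
    """
    if not reading_from_memory:
        return MemoryModuleLoadLevel.COMPLETE
    if has_no_nlist_section:
        return MemoryModuleLoadLevel.MINIMAL
    return target_setting


# A marked dylib read out of memory drops to the minimal level ...
print(effective_load_level(MemoryModuleLoadLevel.COMPLETE, True, True).name)
# ... while the same dylib found on disk is still parsed completely.
print(effective_load_level(MemoryModuleLoadLevel.COMPLETE, False, True).name)
```

The point of the design is that the marker only matters on the slow path: when the file is available on disk, reading the nlist entries is cheap, so nothing changes.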

In addition to a memory Module which has the __TEXT,__lldb_no_nlist
section, indicating that we do not read the symbol table for this
binary, I added a second library which is also a memory Module but
has no such section, to test that lldb did read the symbol table
for that binary.
$(LD) -dynamiclib -o libhas-nlists.dylib has-nlists.o -install_name "@executable_path/libhas-nlists.dylib"

clean::
rm -rf has-nlists.o no-nlists.o no-nlist-sect.o main.o main.s a.out a.out.dSYM libno-nlists.dylib libno-nlists.dylib.dSYM libhas-nlists.dylib libhas-nlists.dylib.dSYM
Member

Would it be easier to rm -rf *.o *.dSYM *.dylib?

Collaborator Author

yeah tbh I don't think we ever even run the clean targets these days, I just stuck this in there out of habit. good suggestion.

# modules from the Debugger so we don't hold on to a reference
# of the on-disk binary.

# If we haven't succeeded after ten attempts of attaching and
Member

I don't really understand the need for this loop. Why would this fail the first time and succeed the second time? Is this trying to work around the debugger running in asynchronous mode or something?

Member

I figured out the answer when reading the test below: because the inferior is removing the dylib and there's no synchronization with the test.

But that leads to another question: why are you not doing this from Python in the test itself? Seems like that would simplify things a lot (the test could just sleep forever) and allow you to do the attach when you know the program is in the expected state?

Collaborator Author

@jasonmolenda jasonmolenda Mar 6, 2025

I tried doing it in Python first, but I ended up removing the dylib before it was loaded in the process, and the process crashed on startup. I'd have to add a sleep or some other sync to guarantee that the inferior had started and loaded the dylib before I removed it. (Sleeps are really unstable when a test gets run on a possibly very slow CI, of course. On any reasonable computer, a sleep(1) in the Python would let the process be launched and running, but I don't trust that kind of pattern with CI after years of seeing very unusual perf behavior.)

Member

Okay, I think I understand the challenge: you can't use lldb to synchronize (e.g. by waiting on a pause()) because by attaching, you're already going to be parsing the library. We have a few other tests that have this issue, and they use a file to synchronize through lldbutil.wait_for_file_on_target.

        # Use a file as a synchronization point between test and inferior.
        pid_file_path = lldbutil.append_to_process_working_directory(
            self, "pid_file_%d" % (int(time.time()))
        )
        self.addTearDownHook(
            lambda: self.run_platform_command("rm %s" % (pid_file_path))
        )

        popen = self.spawnSubprocess(exe, [pid_file_path])

        pid = lldbutil.wait_for_file_on_target(self, pid_file_path)

I think we could use the same approach here.

Collaborator Author

Good point.

@jasonmolenda
Collaborator Author

I adopted the synchronization scheme that we do in other attach tests -- the python provides a filename that the inferior should create with its pid, we launch it and wait until that file exists (indicating that the inferior is done setting up) and then the python tests execute.
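
For readers unfamiliar with the pid-file handshake, here is a self-contained Python sketch of the idea using only the standard library. It is a stand-in for, not a copy of, what the lldb test suite does with `lldbutil.wait_for_file_on_target`; the names `INFERIOR`, `wait_for_pid_file` and `launch_and_sync` are hypothetical, and the "inferior" is a small Python child process rather than the C program in the test.

```python
import os
import subprocess
import sys
import tempfile
import time

# Stand-in inferior: finish setup, publish the pid, then idle so that a
# debugger could attach. The file is written under a temporary name and
# renamed, so a reader never observes a half-written pid.
INFERIOR = """
import os, sys, time
path = sys.argv[1]
with open(path + ".tmp", "w") as f:
    f.write(str(os.getpid()))
os.replace(path + ".tmp", path)
time.sleep(30)
"""


def wait_for_pid_file(path, timeout=10.0, interval=0.05):
    """Poll until `path` exists and holds a pid, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with open(path) as f:
                text = f.read().strip()
            if text:
                return int(text)
        except FileNotFoundError:
            pass
        time.sleep(interval)
    raise TimeoutError("inferior never created " + path)


def launch_and_sync():
    """Launch the inferior and block until it reports that it is ready."""
    with tempfile.TemporaryDirectory() as tmpdir:
        pid_file = os.path.join(tmpdir, "pid_file")
        child = subprocess.Popen([sys.executable, "-c", INFERIOR, pid_file])
        try:
            pid = wait_for_pid_file(pid_file)
            # At this point the inferior is known to be fully started, so
            # it would be safe to delete its dylibs and attach a debugger.
            return pid, child.pid
        finally:
            child.kill()
            child.wait()
```

Compared with a fixed `sleep`, the wait ends as soon as the inferior is ready and only fails after an explicit timeout, which is the property that matters on slow CI machines.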

I changed the code which removes the two dylibs (so lldb is forced to read them out of memory) from the inferior main.c to the Python, where it is simpler. I think we're good to go at this point.

Member

@JDevlieghere JDevlieghere left a comment

LGTM. I think you could simplify the test even further by using C++ but on the other hand the C tests compile a little faster.

std::ofstream pid_file;
pid_file.open (pid_file_path);
pid_file << getpid(); 
pid_file.close();

@jasonmolenda
Collaborator Author

LGTM. I think you could simplify the test even further by using C++ but on the other hand the C tests compile a little faster.

yeah I was going to rewrite it into slightly nicer C but then I realized there was more value in doing it exactly the same as the other test that I copied it from, so they could be both found by a simple search.

@jasonmolenda jasonmolenda merged commit 397696b into llvm:main Mar 7, 2025
8 of 9 checks passed
@jasonmolenda jasonmolenda deleted the dont-read-nlist-entries-for-specially-marked-binary branch March 7, 2025 00:34
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 7, 2025
…vm#129967)

(cherry picked from commit 397696b)
jasonmolenda added a commit that referenced this pull request Mar 7, 2025
…nary (#129967)"

This reverts commit 397696b.

This breaks the macOS CI bots, I need to use $LDFLAGS in the $LD
invocation when building the dylib to get the dylibs to build on
the CI bots.  But I've added "-lno-nlists -lhas-nlists" to the LDFLAGS
for the main binary in the same directory, so using LDFLAGS will
result in a compile error for the dylibs.  I'll need to build the
dylibs in a subdir with a different Makefile, will reland with that
change in a bit.
jasonmolenda added a commit that referenced this pull request Mar 7, 2025
…29967)

Relanding this change with restructured Makefiles for the test case
that should pass on the CI bots.

rdar://146167816
jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Mar 7, 2025
…vm#129967)

(cherry picked from commit 1a31bb3)
@jasonmolenda
Collaborator Author

FTR on the CI bot my API test did not make correctly. I reverted the PR, rewrote the makefiles for the API test, relanded.

JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Mar 7, 2025
…t-records-from-memory-for-specially-marked-binary-6.1

[lldb][Mach-O] Don't read symbol table of specially marked binary (llvm#129967)
@labath
Collaborator

labath commented Mar 10, 2025

I create the specially named section by compiling an assembly file that puts a byte in the section which makes for a bit of a messy Makefile

You probably don't need an asm file for that. There are at least two ways to generate a random section from C:

asm(R"(.section ".text.foo"; .byte 0)");

void  __attribute__((section(".text.bar"))) f(){}

jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…vm#129967)

jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…nary (llvm#129967)"

jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…vm#129967)
