[lldb][Mach-O] Read dyld_all_image_infos addr from main bin spec LC_NOTE #127156

jasonmolenda
Collaborator

Mach-O corefiles have LC_NOTE metadata. One LC_NOTE that lldb recognizes is main bin spec, which can specify that this is a kernel corefile, userland corefile, or firmware/standalone corefile. With a userland corefile, the LC_NOTE would specify the virtual address of the dyld binary's Mach-O header. lldb would create a Module from that in-memory binary, find the dyld_all_image_infos object in dyld's DATA segment, and use that object to find all of the binaries present in the corefile.

ProcessMachCore takes the metadata from this LC_NOTE and passes the address to the DynamicLoader plugin via its GetImageInfoAddress() method, so the DynamicLoader can find all of the binaries and load them in the Target at their correct virtual addresses.

We have a corefile creator who would prefer to specify the address of dyld_all_image_infos directly, instead of specifying the address of dyld and parsing that to find the object. DynamicLoaderMacOSX, the DynamicLoader plugin being used here, will accept either a dyld virtual address or a dyld_all_image_infos virtual address from ProcessMachCore, and do the correct thing with either value.

lldb's process save-core Mach-O corefile writer will continue to specify the virtual address of the dyld binary.

rdar://144322688
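
For illustration, here is a minimal sketch of decoding the leading fields of a main bin spec payload. It covers only the four fields visible in the struct comment in ObjectFileMachO.cpp (version, type, address, slide); anything after slide is not modeled. The names `MainBinSpecPrefix`, `ReadLE`, and `ParseMainBinSpecPrefix` are hypothetical, and the sketch assumes the payload is little-endian and tightly packed. It is not lldb's actual parsing code.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical mirror of the first four fields of the payload.
struct MainBinSpecPrefix {
  uint32_t version;
  uint32_t type;    // 2 == user process, dyld Mach-O header address
                    // 4 == user process, dyld_all_image_infos address
  uint64_t address; // UINT64_MAX if address not specified
  uint64_t slide;   // UINT64_MAX if unspecified
};

// Read an n-byte little-endian unsigned integer (assumed byte order).
static uint64_t ReadLE(const uint8_t *p, size_t n) {
  uint64_t value = 0;
  for (size_t i = 0; i < n; ++i)
    value |= static_cast<uint64_t>(p[i]) << (8 * i);
  return value;
}

// Returns std::nullopt if the buffer is too small to hold the prefix.
std::optional<MainBinSpecPrefix> ParseMainBinSpecPrefix(const uint8_t *data,
                                                        size_t size) {
  constexpr size_t kPrefixSize = 4 + 4 + 8 + 8;
  if (data == nullptr || size < kPrefixSize)
    return std::nullopt;
  MainBinSpecPrefix spec;
  spec.version = static_cast<uint32_t>(ReadLE(data + 0, 4));
  spec.type = static_cast<uint32_t>(ReadLE(data + 4, 4));
  spec.address = ReadLE(data + 8, 8);
  spec.slide = ReadLE(data + 16, 8);
  return spec;
}
```

A corefile creator using the new type would write `type == 4` and put the dyld_all_image_infos virtual address in `address`.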

@llvmbot
Member

llvmbot commented Feb 14, 2025

@llvm/pr-subscribers-lldb

Author: Jason Molenda (jasonmolenda)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/127156.diff

4 Files Affected:

  • (modified) lldb/include/lldb/Symbol/ObjectFile.h (+6-3)
  • (modified) lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (+9-1)
  • (modified) lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp (+37-8)
  • (modified) lldb/source/Plugins/Process/mach-core/ProcessMachCore.h (+1)
diff --git a/lldb/include/lldb/Symbol/ObjectFile.h b/lldb/include/lldb/Symbol/ObjectFile.h
index d89314d44bf67..8873209eeece6 100644
--- a/lldb/include/lldb/Symbol/ObjectFile.h
+++ b/lldb/include/lldb/Symbol/ObjectFile.h
@@ -81,9 +81,12 @@ class ObjectFile : public std::enable_shared_from_this<ObjectFile>,
   enum BinaryType {
     eBinaryTypeInvalid = 0,
     eBinaryTypeUnknown,
-    eBinaryTypeKernel,    /// kernel binary
-    eBinaryTypeUser,      /// user process binary
-    eBinaryTypeStandalone /// standalone binary / firmware
+    eBinaryTypeKernel,            /// kernel binary
+    eBinaryTypeUser,              /// user process binary,
+                                  /// dyld addr
+    eBinaryTypeUserAllImageInfos, /// user process binary,
+                                  /// dyld_all_image_infos addr
+    eBinaryTypeStandalone         /// standalone binary / firmware
   };
 
   struct LoadableData {
diff --git a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
index 4e356a7c8f5d9..8cf6ed268f3b8 100644
--- a/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
+++ b/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp
@@ -5599,9 +5599,13 @@ bool ObjectFileMachO::GetCorefileMainBinaryInfo(addr_t &value,
       // struct main_bin_spec
       // {
       //     uint32_t version;       // currently 2
-      //     uint32_t type;          // 0 == unspecified, 1 == kernel,
+      //     uint32_t type;          // 0 == unspecified,
+      //                             // 1 == kernel
       //                             // 2 == user process,
+      //                                     dyld mach-o binary addr
       //                             // 3 == standalone binary
+      //                             // 4 == user process,
+      //                             //      dyld_all_image_infos addr
       //     uint64_t address;       // UINT64_MAX if address not specified
       //     uint64_t slide;         // slide, UINT64_MAX if unspecified
       //                             // 0 if no slide needs to be applied to
@@ -5669,6 +5673,10 @@ bool ObjectFileMachO::GetCorefileMainBinaryInfo(addr_t &value,
             type = eBinaryTypeStandalone;
             typestr = "standalone";
             break;
+          case 4:
+            type = eBinaryTypeUserAllImageInfos;
+            typestr = "userland dyld_all_image_infos";
+            break;
           }
           LLDB_LOGF(log,
                     "LC_NOTE 'main bin spec' found, version %d type %d "
diff --git a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
index eef9bd4a175ec..281f3a0db8f69 100644
--- a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
+++ b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.cpp
@@ -114,6 +114,7 @@ ProcessMachCore::ProcessMachCore(lldb::TargetSP target_sp,
     : PostMortemProcess(target_sp, listener_sp, core_file), m_core_aranges(),
       m_core_range_infos(), m_core_module_sp(),
       m_dyld_addr(LLDB_INVALID_ADDRESS),
+      m_dyld_all_image_infos_addr(LLDB_INVALID_ADDRESS),
       m_mach_kernel_addr(LLDB_INVALID_ADDRESS) {}
 
 // Destructor
@@ -320,6 +321,9 @@ bool ProcessMachCore::LoadBinariesViaMetadata() {
     } else if (type == ObjectFile::eBinaryTypeUser) {
       m_dyld_addr = objfile_binary_value;
       m_dyld_plugin_name = DynamicLoaderMacOSXDYLD::GetPluginNameStatic();
+    } else if (type == ObjectFile::eBinaryTypeUserAllImageInfos) {
+      m_dyld_all_image_infos_addr = objfile_binary_value;
+      m_dyld_plugin_name = DynamicLoaderMacOSXDYLD::GetPluginNameStatic();
     } else {
       const bool force_symbol_search = true;
       const bool notify = true;
@@ -466,6 +470,7 @@ void ProcessMachCore::LoadBinariesViaExhaustiveSearch() {
   addr_t saved_user_dyld_addr = m_dyld_addr;
   m_mach_kernel_addr = LLDB_INVALID_ADDRESS;
   m_dyld_addr = LLDB_INVALID_ADDRESS;
+  m_dyld_all_image_infos_addr = LLDB_INVALID_ADDRESS;
 
   addr_t better_kernel_address =
       DynamicLoaderDarwinKernel::SearchForDarwinKernel(this);
@@ -507,6 +512,12 @@ void ProcessMachCore::LoadBinariesAndSetDYLD() {
                   "image at 0x%" PRIx64,
                   __FUNCTION__, m_dyld_addr);
         m_dyld_plugin_name = DynamicLoaderMacOSXDYLD::GetPluginNameStatic();
+      } else if (m_dyld_all_image_infos_addr != LLDB_INVALID_ADDRESS) {
+        LLDB_LOGF(log,
+                  "ProcessMachCore::%s: Using user process dyld "
+                  "dyld_all_image_infos at 0x%" PRIx64,
+                  __FUNCTION__, m_dyld_all_image_infos_addr);
+        m_dyld_plugin_name = DynamicLoaderMacOSXDYLD::GetPluginNameStatic();
       }
     } else {
       if (m_dyld_addr != LLDB_INVALID_ADDRESS) {
@@ -515,6 +526,11 @@ void ProcessMachCore::LoadBinariesAndSetDYLD() {
                   "image at 0x%" PRIx64,
                   __FUNCTION__, m_dyld_addr);
         m_dyld_plugin_name = DynamicLoaderMacOSXDYLD::GetPluginNameStatic();
+      } else if (m_dyld_all_image_infos_addr != LLDB_INVALID_ADDRESS) {
+        LLDB_LOGF(log,
+                  "ProcessMachCore::%s: Using user process dyld "
+                  "dyld_all_image_infos at 0x%" PRIx64,
+                  __FUNCTION__, m_dyld_all_image_infos_addr);
       } else if (m_mach_kernel_addr != LLDB_INVALID_ADDRESS) {
         LLDB_LOGF(log,
                   "ProcessMachCore::%s: Using kernel "
@@ -763,19 +779,32 @@ void ProcessMachCore::Initialize() {
 }
 
 addr_t ProcessMachCore::GetImageInfoAddress() {
-  // If we found both a user-process dyld and a kernel binary, we need to
-  // decide which to prefer.
+  // The DynamicLoader plugin will call back in to this Process
+  // method to find the virtual address of one of these:
+  //   1. The xnu mach kernel binary Mach-O header
+  //   2. The dyld binary Mach-O header
+  //   3. dyld's dyld_all_image_infos object
+  //
+  //  DynamicLoaderMacOSX will accept either the dyld Mach-O header
+  //  address or the dyld_all_image_infos interchangably, no need
+  //  to distinguish between them.  It disambiguates by the Mach-O
+  //  file magic number at the start.
   if (GetCorefilePreference() == eKernelCorefile) {
-    if (m_mach_kernel_addr != LLDB_INVALID_ADDRESS) {
+    if (m_mach_kernel_addr != LLDB_INVALID_ADDRESS)
       return m_mach_kernel_addr;
-    }
-    return m_dyld_addr;
+    if (m_dyld_addr != LLDB_INVALID_ADDRESS)
+      return m_dyld_addr;
   } else {
-    if (m_dyld_addr != LLDB_INVALID_ADDRESS) {
+    if (m_dyld_addr != LLDB_INVALID_ADDRESS)
       return m_dyld_addr;
-    }
-    return m_mach_kernel_addr;
+    if (m_mach_kernel_addr != LLDB_INVALID_ADDRESS)
+      return m_mach_kernel_addr;
   }
+
+  // m_dyld_addr and m_mach_kernel_addr both
+  // invalid, return m_dyld_all_image_infos_addr
+  // in case it has a useful value.
+  return m_dyld_all_image_infos_addr;
 }
 
 lldb_private::ObjectFile *ProcessMachCore::GetCoreObjectFile() {
diff --git a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.h b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.h
index 8996ae116614b..6ba9f2354edf9 100644
--- a/lldb/source/Plugins/Process/mach-core/ProcessMachCore.h
+++ b/lldb/source/Plugins/Process/mach-core/ProcessMachCore.h
@@ -131,6 +131,7 @@ class ProcessMachCore : public lldb_private::PostMortemProcess {
   VMRangeToPermissions m_core_range_infos;
   lldb::ModuleSP m_core_module_sp;
   lldb::addr_t m_dyld_addr;
+  lldb::addr_t m_dyld_all_image_infos_addr;
   lldb::addr_t m_mach_kernel_addr;
   llvm::StringRef m_dyld_plugin_name;
 };
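
The selection order in the patched `GetImageInfoAddress()` above can be sketched as a standalone function. The types `CoreAddresses` and `CorefilePreference` and the constant `kInvalidAddress` are hypothetical stand-ins for the ProcessMachCore members and `LLDB_INVALID_ADDRESS`; only the ordering of the checks is taken from the diff.

```cpp
#include <cstdint>

// Stand-in for LLDB_INVALID_ADDRESS (assumed value, for illustration only).
constexpr uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical stand-in for the result of GetCorefilePreference().
enum class CorefilePreference { Kernel, User };

// Hypothetical bundle of the three address members of ProcessMachCore.
struct CoreAddresses {
  uint64_t dyld = kInvalidAddress;
  uint64_t dyld_all_image_infos = kInvalidAddress;
  uint64_t mach_kernel = kInvalidAddress;
};

uint64_t SelectImageInfoAddress(const CoreAddresses &addrs,
                                CorefilePreference preference) {
  if (preference == CorefilePreference::Kernel) {
    if (addrs.mach_kernel != kInvalidAddress)
      return addrs.mach_kernel;
    if (addrs.dyld != kInvalidAddress)
      return addrs.dyld;
  } else {
    if (addrs.dyld != kInvalidAddress)
      return addrs.dyld;
    if (addrs.mach_kernel != kInvalidAddress)
      return addrs.mach_kernel;
  }
  // Neither the dyld nor the kernel address is valid; fall back to the
  // dyld_all_image_infos address, which may itself be invalid.
  return addrs.dyld_all_image_infos;
}
```

Note that in this ordering the dyld_all_image_infos address is a last resort: a valid kernel address is returned ahead of it even when the preference is a user process.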

@jasonmolenda
Collaborator Author

I wanted to add a new eBinaryType enum value to ObjectFile to describe what address is being returned. Within ProcessMachCore, though, there was no need to distinguish between the existing m_dyld_addr and the new m_dyld_all_image_infos_addr; I did it to keep the meaning of the address clear at this layer, and to make the logging clearer. ProcessMachCore returns either address to DynamicLoaderMacOSX, which already sniffs for a Mach-O magic number in the first 4 bytes to disambiguate between them. I could have ProcessMachCore put either value in m_dyld_addr and everything would still work.
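
As a rough sketch of that kind of check (the actual DynamicLoaderMacOSX code is not part of this PR, so the function name `LooksLikeMachOHeader` and its shape are assumptions), the first 4 bytes at the address can be compared against the Mach-O magic values from `<mach-o/loader.h>`:

```cpp
#include <cstddef>
#include <cstdint>

// Magic values as defined in <mach-o/loader.h>.
constexpr uint32_t kMhMagic = 0xfeedface;   // MH_MAGIC
constexpr uint32_t kMhCigam = 0xcefaedfe;   // MH_CIGAM
constexpr uint32_t kMhMagic64 = 0xfeedfacf; // MH_MAGIC_64
constexpr uint32_t kMhCigam64 = 0xcffaedfe; // MH_CIGAM_64

// Hypothetical helper: true if the bytes start with a Mach-O magic number.
// The word is assembled as little-endian; checking both the MAGIC and CIGAM
// forms covers headers written in either byte order.
bool LooksLikeMachOHeader(const uint8_t *bytes, size_t size) {
  if (bytes == nullptr || size < 4)
    return false;
  const uint32_t magic = static_cast<uint32_t>(bytes[0]) |
                         (static_cast<uint32_t>(bytes[1]) << 8) |
                         (static_cast<uint32_t>(bytes[2]) << 16) |
                         (static_cast<uint32_t>(bytes[3]) << 24);
  switch (magic) {
  case kMhMagic:
  case kMhCigam:
  case kMhMagic64:
  case kMhCigam64:
    return true;
  default:
    return false;
  }
}
```

If the check fails, the address would be treated as pointing at dyld_all_image_infos rather than at dyld's Mach-O header.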

Member

@JDevlieghere JDevlieghere left a comment

LGTM

case 4:
type = eBinaryTypeUserAllImageInfos;
typestr = "userland dyld_all_image_infos";
break;
Member

Should this have a default case that sets type to eBinaryTypeInvalid?

Member

If you do, it might be useful to add a log. If that situation were to happen, you'd see it in the logs immediately and know that either the core file looks unexpected or LLDB has a bug in core file handling.

Collaborator Author

The typestr is set to "unrecognized type" and will be printed if logging is enabled, but the type enum is uninitialized and will also be printed, which could be confusing. I'll add an initialization. This method doesn't currently take a Target or Process pointer, so I can't print a message to the user asynchronously; they will need to enable logging when something goes wrong to find it.
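
A minimal sketch of the pattern discussed in this thread, with the type initialized up front so an unknown value never leaves it indeterminate. The function `ClassifyMainBinSpecType` is hypothetical, the enum is copied from the diff for self-containment, and the mapping of 0 to `eBinaryTypeUnknown` plus the label strings for cases 0 through 2 are assumptions; only "standalone", "userland dyld_all_image_infos", and "unrecognized type" appear on this page. The change that was actually committed may differ.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Copy of the enum from ObjectFile.h as shown in the diff.
enum BinaryType {
  eBinaryTypeInvalid = 0,
  eBinaryTypeUnknown,
  eBinaryTypeKernel,
  eBinaryTypeUser,
  eBinaryTypeUserAllImageInfos,
  eBinaryTypeStandalone
};

// Hypothetical helper mapping the raw `type` field to an enum and a label.
std::pair<BinaryType, std::string> ClassifyMainBinSpecType(uint32_t raw_type) {
  // Initialize both outputs so an unrecognized value is logged consistently.
  BinaryType type = eBinaryTypeInvalid;
  std::string typestr = "unrecognized type";
  switch (raw_type) {
  case 0: // assumed mapping for "unspecified"
    type = eBinaryTypeUnknown;
    typestr = "unknown"; // illustrative label
    break;
  case 1:
    type = eBinaryTypeKernel;
    typestr = "kernel"; // illustrative label
    break;
  case 2:
    type = eBinaryTypeUser;
    typestr = "user process"; // illustrative label
    break;
  case 3:
    type = eBinaryTypeStandalone;
    typestr = "standalone";
    break;
  case 4:
    type = eBinaryTypeUserAllImageInfos;
    typestr = "userland dyld_all_image_infos";
    break;
  default:
    break; // keep eBinaryTypeInvalid / "unrecognized type"
  }
  return {type, typestr};
}
```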


github-actions bot commented Feb 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

the logging message is clearer when an unknown
type is found in the main bin spec.
@jasonmolenda jasonmolenda merged commit 1f5edb1 into llvm:main Feb 18, 2025
5 of 6 checks passed
@jasonmolenda jasonmolenda deleted the accept-dyld_all_image_infos-address-in-main-bin-spec-LC_NOTE branch February 18, 2025 20:41
@llvm-ci
Collaborator

llvm-ci commented Feb 18, 2025

LLVM Buildbot has detected a new failure on builder lldb-arm-ubuntu running on linaro-lldb-arm-ubuntu while building lldb at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/11664

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: terminal/TestSTTYBeforeAndAfter.py (1128 of 2905)
PASS: lldb-api :: test_utils/TestDecorators.py (1129 of 2905)
PASS: lldb-api :: test_utils/TestInlineTest.py (1130 of 2905)
PASS: lldb-api :: test_utils/TestPExpectTest.py (1131 of 2905)
PASS: lldb-api :: test_utils/base/TestBaseTest.py (1132 of 2905)
PASS: lldb-api :: python_api/watchpoint/watchlocation/TestTargetWatchAddress.py (1133 of 2905)
PASS: lldb-api :: terminal/TestEditline.py (1134 of 2905)
UNSUPPORTED: lldb-api :: tools/lldb-dap/breakpoint-events/TestDAP_breakpointEvents.py (1135 of 2905)
PASS: lldb-api :: tools/lldb-dap/attach/TestDAP_attachByPortNum.py (1136 of 2905)
PASS: lldb-api :: tools/lldb-dap/breakpoint/TestDAP_breakpointLocations.py (1137 of 2905)
FAIL: lldb-api :: tools/lldb-dap/attach/TestDAP_attach.py (1138 of 2905)
******************** TEST 'lldb-api :: tools/lldb-dap/attach/TestDAP_attach.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --arch armv8l --build-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-arm-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-arm-ubuntu/llvm-project/lldb/test/API/tools/lldb-dap/attach -p TestDAP_attach.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 1f5edb17b23f5ac0576f83a6c122ce38bd5ec18e)
  clang revision 1f5edb17b23f5ac0576f83a6c122ce38bd5ec18e
  llvm revision 1f5edb17b23f5ac0576f83a6c122ce38bd5ec18e
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']
========= DEBUG ADAPTER PROTOCOL LOGS =========
1739912216.176085234 --> 
Content-Length: 344

{
  "arguments": {
    "adapterID": "lldb-native",
    "clientID": "vscode",
    "columnsStartAt1": true,
    "linesStartAt1": true,
    "locale": "en-us",
    "pathFormat": "path",
    "sourceInitFile": false,
    "supportsRunInTerminalRequest": true,
    "supportsStartDebuggingRequest": true,
    "supportsVariablePaging": true,
    "supportsVariableType": true
  },
  "command": "initialize",
  "seq": 1,
  "type": "request"
}
1739912216.180476904 <-- 
Content-Length: 1631


jasonmolenda added a commit to jasonmolenda/llvm-project that referenced this pull request Feb 18, 2025
…_NOTE (llvm#127156)

(cherry picked from commit 1f5edb1)
wldfngrs pushed a commit to wldfngrs/llvm-project that referenced this pull request Feb 19, 2025
…_NOTE (llvm#127156)
