Skip to content

Ensure that the executable module is ModuleList[0] #78360

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 17, 2024

Conversation

jimingham
Copy link
Collaborator

We claim in a couple places that the zeroth element of the module list for a target is the main executable, but we don't actually enforce that in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that the first object file of type Executable added to it is moved to the front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first, this only adds a check for whether the first element in the module list is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree that the zeroth element must be the executable, it instead returns the first Module of type Executable. But I can't tell whether that was a change in intention or just working around the bug that we don't always maintain this ordering. But given we've said this in scripting as well as internally, I think we shouldn't change our minds about this.

We claim in a couple places that the zeroth element of the module
list for a target is the main executable, but we don't actually
enforce that in the ModuleList class.  As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off.  This patch ensures that
the first object file of type Executable added to it is moved to the front
of the ModuleList.  I also added a test for this.
@llvmbot
Copy link
Member

llvmbot commented Jan 16, 2024

@llvm/pr-subscribers-lldb

Author: None (jimingham)

Changes

We claim in a couple places that the zeroth element of the module list for a target is the main executable, but we don't actually enforce that in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that the first object file of type Executable added to it is moved to the front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first, this only adds a check for whether the first element in the module list is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree that the zeroth element must be the executable, it instead returns the first Module of type Executable. But I can't tell whether that was a change in intention or just working around the bug that we don't always maintain this ordering. But given we've said this in scripting as well as internally, I think we shouldn't change our minds about this.


Full diff: https://github.com/llvm/llvm-project/pull/78360.diff

5 Files Affected:

  • (modified) lldb/source/Core/ModuleList.cpp (+23-1)
  • (added) lldb/test/API/functionalities/executable_first/Makefile (+4)
  • (added) lldb/test/API/functionalities/executable_first/TestExecutableFirst.py (+46)
  • (added) lldb/test/API/functionalities/executable_first/b.cpp (+1)
  • (added) lldb/test/API/functionalities/executable_first/main.cpp (+6)
diff --git a/lldb/source/Core/ModuleList.cpp b/lldb/source/Core/ModuleList.cpp
index 2180f29f3694279..80299f80a278693 100644
--- a/lldb/source/Core/ModuleList.cpp
+++ b/lldb/source/Core/ModuleList.cpp
@@ -211,7 +211,29 @@ ModuleList::~ModuleList() = default;
 void ModuleList::AppendImpl(const ModuleSP &module_sp, bool use_notifier) {
   if (module_sp) {
     std::lock_guard<std::recursive_mutex> guard(m_modules_mutex);
-    m_modules.push_back(module_sp);
+    // We are required to keep the first element of the Module List as the
+    // executable module.  So check here and if the first module is NOT an 
+    // but the new one is, we insert this module at the beginning, rather than 
+    // at the end.
+    // We don't need to do any of this if the list is empty:
+    if (m_modules.empty()) {
+      m_modules.push_back(module_sp);
+    } else {
+      // Since producing the ObjectFile may take some work, first check the 0th
+      // element, and only if that's NOT an executable look at the incoming
+      // ObjectFile.  That way in the normal case we only look at the element
+      // 0 ObjectFile. 
+      const bool elem_zero_is_executable 
+          = m_modules[0]->GetObjectFile()->GetType() 
+              == ObjectFile::Type::eTypeExecutable;
+      lldb_private::ObjectFile *obj = module_sp->GetObjectFile();
+      if (!elem_zero_is_executable && obj 
+          && obj->GetType() == ObjectFile::Type::eTypeExecutable) {
+        m_modules.insert(m_modules.begin(), module_sp);
+      } else {
+        m_modules.push_back(module_sp);
+      }
+    }
     if (use_notifier && m_notifier)
       m_notifier->NotifyModuleAdded(*this, module_sp);
   }
diff --git a/lldb/test/API/functionalities/executable_first/Makefile b/lldb/test/API/functionalities/executable_first/Makefile
new file mode 100644
index 000000000000000..74060ad03c65849
--- /dev/null
+++ b/lldb/test/API/functionalities/executable_first/Makefile
@@ -0,0 +1,4 @@
+CXX_SOURCES := main.cpp
+DYLIB_CXX_SOURCES := b.cpp
+DYLIB_NAME := bar
+include Makefile.rules
diff --git a/lldb/test/API/functionalities/executable_first/TestExecutableFirst.py b/lldb/test/API/functionalities/executable_first/TestExecutableFirst.py
new file mode 100644
index 000000000000000..bcf967a596deb87
--- /dev/null
+++ b/lldb/test/API/functionalities/executable_first/TestExecutableFirst.py
@@ -0,0 +1,46 @@
+# This test checks that we make the executable the first
+# element in the image list.
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class TestExecutableIsFirst(TestBase):
+    NO_DEBUG_INFO_TESTCASE = True
+
+    def test_executable_is_first_before_run(self):
+        self.build()
+
+        ctx = self.platformContext
+        lib_name = ctx.shlib_prefix + "bar." + ctx.shlib_extension
+
+        exe = self.getBuildArtifact("a.out")
+        lib = self.getBuildArtifact(lib_name)
+
+        target = self.dbg.CreateTarget(None)
+        module = target.AddModule(lib, None, None)
+        self.assertTrue(module.IsValid(), "Added the module for the library")
+        
+        module = target.AddModule(exe, None, None)
+        self.assertTrue(module.IsValid(), "Added the executable module")
+
+       # This is the executable module so it should be the first in the list:
+        first_module = target.GetModuleAtIndex(0)
+        print("This is the first test, this one succeeds")
+        self.assertEqual(module, first_module, "This executable is the first module")
+
+        # The executable property is an SBFileSpec to the executable.  Make sure
+        # that is also right:
+        executable_module = target.executable
+        self.assertEqual(first_module.file, executable_module, "Python property is also our module")
+
+    def test_executable_is_first_during_run(self):
+        self.build()
+        (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(self, "break after function call", lldb.SBFileSpec("main.cpp"))
+        
+        first_module = target.GetModuleAtIndex(0)
+        self.assertTrue(first_module.IsValid(), "We have at least one module")
+        executable_module = target.executable
+        self.assertEqual(first_module.file, executable_module, "They are the same")
diff --git a/lldb/test/API/functionalities/executable_first/b.cpp b/lldb/test/API/functionalities/executable_first/b.cpp
new file mode 100644
index 000000000000000..911447bee489cef
--- /dev/null
+++ b/lldb/test/API/functionalities/executable_first/b.cpp
@@ -0,0 +1 @@
+int LLDB_DYLIB_EXPORT b_function() { return 500; }
diff --git a/lldb/test/API/functionalities/executable_first/main.cpp b/lldb/test/API/functionalities/executable_first/main.cpp
new file mode 100644
index 000000000000000..ba61bf837b96519
--- /dev/null
+++ b/lldb/test/API/functionalities/executable_first/main.cpp
@@ -0,0 +1,6 @@
+extern int b_function();
+
+int main(int argc, char* argv[]) {
+  int ret_value = b_function();
+  return ret_value; // break after function call
+}

Copy link

github-actions bot commented Jan 16, 2024

✅ With the latest revision this PR passed the Python code formatter.

Copy link

github-actions bot commented Jan 16, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 32dd5b20973bde1ef77fa3b84b9f85788a1a303a bd4f49d12d5ceecd559139fef0b4f2bfaa00164a -- lldb/test/API/functionalities/executable_first/b.cpp lldb/test/API/functionalities/executable_first/main.cpp lldb/source/Core/ModuleList.cpp
View the diff from clang-format here.
diff --git a/lldb/source/Core/ModuleList.cpp b/lldb/source/Core/ModuleList.cpp
index 80299f80a2..a5f8624b11 100644
--- a/lldb/source/Core/ModuleList.cpp
+++ b/lldb/source/Core/ModuleList.cpp
@@ -212,8 +212,8 @@ void ModuleList::AppendImpl(const ModuleSP &module_sp, bool use_notifier) {
   if (module_sp) {
     std::lock_guard<std::recursive_mutex> guard(m_modules_mutex);
     // We are required to keep the first element of the Module List as the
-    // executable module.  So check here and if the first module is NOT an 
-    // but the new one is, we insert this module at the beginning, rather than 
+    // executable module.  So check here and if the first module is NOT an
+    // but the new one is, we insert this module at the beginning, rather than
     // at the end.
     // We don't need to do any of this if the list is empty:
     if (m_modules.empty()) {
@@ -222,13 +222,13 @@ void ModuleList::AppendImpl(const ModuleSP &module_sp, bool use_notifier) {
       // Since producing the ObjectFile may take some work, first check the 0th
       // element, and only if that's NOT an executable look at the incoming
       // ObjectFile.  That way in the normal case we only look at the element
-      // 0 ObjectFile. 
-      const bool elem_zero_is_executable 
-          = m_modules[0]->GetObjectFile()->GetType() 
-              == ObjectFile::Type::eTypeExecutable;
+      // 0 ObjectFile.
+      const bool elem_zero_is_executable =
+          m_modules[0]->GetObjectFile()->GetType() ==
+          ObjectFile::Type::eTypeExecutable;
       lldb_private::ObjectFile *obj = module_sp->GetObjectFile();
-      if (!elem_zero_is_executable && obj 
-          && obj->GetType() == ObjectFile::Type::eTypeExecutable) {
+      if (!elem_zero_is_executable && obj &&
+          obj->GetType() == ObjectFile::Type::eTypeExecutable) {
         m_modules.insert(m_modules.begin(), module_sp);
       } else {
         m_modules.push_back(module_sp);
diff --git a/lldb/test/API/functionalities/executable_first/main.cpp b/lldb/test/API/functionalities/executable_first/main.cpp
index ba61bf837b..e154a01d2d 100644
--- a/lldb/test/API/functionalities/executable_first/main.cpp
+++ b/lldb/test/API/functionalities/executable_first/main.cpp
@@ -1,6 +1,6 @@
 extern int b_function();
 
-int main(int argc, char* argv[]) {
+int main(int argc, char *argv[]) {
   int ret_value = b_function();
   return ret_value; // break after function call
 }

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jimingham jimingham merged commit a4cd99e into llvm:main Jan 17, 2024
@jimingham jimingham deleted the executable_is_first branch January 17, 2024 01:12
jimingham added a commit to jimingham/from-apple-llvm-project that referenced this pull request Jan 17, 2024
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.

(cherry picked from commit a4cd99e)
jimingham added a commit to jimingham/from-apple-llvm-project that referenced this pull request Jan 17, 2024
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.

(cherry picked from commit a4cd99e)
jimingham added a commit to jimingham/from-apple-llvm-project that referenced this pull request Jan 17, 2024
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.

(cherry picked from commit a4cd99e)
@petrhosek
Copy link
Member

lldb-api :: functionalities/executable_first/TestExecutableFirst.py is failing on our Linux builders with the following error:

Script:
--
/b/s/w/ir/x/w/lldb_install/python3/bin/python3 /b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env ARCHIVER=/b/s/w/ir/x/w/cipd/bin/llvm-ar --env OBJCOPY=/b/s/w/ir/x/w/cipd/bin/llvm-objcopy --env LLVM_LIBS_DIR=/b/s/w/ir/x/w/llvm_build/./lib --env LLVM_INCLUDE_DIR=/b/s/w/ir/x/w/llvm_build/include --env LLVM_TOOLS_DIR=/b/s/w/ir/x/w/llvm_build/./bin --libcxx-include-dir /b/s/w/ir/x/w/llvm_build/include/c++/v1 --libcxx-include-target-dir /b/s/w/ir/x/w/llvm_build/include/x86_64-unknown-linux-gnu/c++/v1 --libcxx-library-dir /b/s/w/ir/x/w/llvm_build/./lib/x86_64-unknown-linux-gnu --arch x86_64 --build-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex --lldb-module-cache-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /b/s/w/ir/x/w/llvm_build/./bin/lldb --compiler /b/s/w/ir/x/w/llvm_build/./bin/clang --dsymutil /b/s/w/ir/x/w/llvm_build/./bin/dsymutil --llvm-tools-dir /b/s/w/ir/x/w/llvm_build/./bin --lldb-libs-dir /b/s/w/ir/x/w/llvm_build/./lib /b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/functionalities/executable_first -p TestExecutableFirst.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 18.0.0git (https://llvm.googlesource.com/a/llvm-project revision a4cd99ea8736eda2b8b4de34453f55008bcf9c30)
  clang revision a4cd99ea8736eda2b8b4de34453f55008bcf9c30
  llvm revision a4cd99ea8736eda2b8b4de34453f55008bcf9c30
Skipping the following test categories: ['dsym', 'gmodules', 'debugserver', 'objc']
This is the first test, this one succeeds

--
Command Output (stderr):
--
FAIL: LLDB (/b/s/w/ir/x/w/llvm_build/bin/clang-x86_64) :: test_executable_is_first_before_run (TestExecutableFirst.TestExecutableIsFirst)
FAIL: LLDB (/b/s/w/ir/x/w/llvm_build/bin/clang-x86_64) :: test_executable_is_first_during_run (TestExecutableFirst.TestExecutableIsFirst)
======================================================================
FAIL: test_executable_is_first_before_run (TestExecutableFirst.TestExecutableIsFirst)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/functionalities/executable_first/TestExecutableFirst.py", line 32, in test_executable_is_first_before_run
    self.assertEqual(module, first_module, "This executable is the first module")
AssertionError: (x86_64) /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/functionalities/executable_first/TestExecutableFirst.test_executable_is_first_before_run/a.out != (x86_64) /b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/functionalities/executable_first/TestExecutableFirst.test_executable_is_first_before_run/libbar.so : This executable is the first module
Config=x86_64-/b/s/w/ir/x/w/llvm_build/bin/clang
======================================================================
FAIL: test_executable_is_first_during_run (TestExecutableFirst.TestExecutableIsFirst)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/b/s/w/ir/x/w/llvm-llvm-project/lldb/test/API/functionalities/executable_first/TestExecutableFirst.py", line 43, in test_executable_is_first_during_run
    (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/b/s/w/ir/x/w/llvm-llvm-project/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 1009, in run_to_source_breakpoint
    return run_to_breakpoint_do_run(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/b/s/w/ir/x/w/llvm-llvm-project/lldb/packages/Python/lldbsuite/test/lldbutil.py", line 906, in run_to_breakpoint_do_run
    test.fail(
AssertionError: Test process is not stopped at breakpoint: state: exited, exit code: 127, stdout: '/b/s/w/ir/x/w/llvm_build/lldb-test-build.noindex/functionalities/executable_first/TestExecutableFirst.test_executable_is_first_during_run/a.out: error while loading shared libraries: libbar.so: cannot open shared object file: No such file or directory
'
Config=x86_64-/b/s/w/ir/x/w/llvm_build/bin/clang
----------------------------------------------------------------------
Ran 2 tests in 0.298s

RESULT: FAILED (0 passes, 2 failures, 0 errors, 0 skipped, 0 expected failures, 0 unexpected successes)

--

Do you know what might be the issue?

@labath
Copy link
Collaborator

labath commented Jan 17, 2024

The problem is that linux/elf does not really have hard line between shared libraries, (position-independent) executables and the dynamic linker. they all have e_type = ET_DYN in their header. It is possible to create a shared library that can also be executed (unless one is extremely careful it will crash very quickly, but that crash may exactly be something that one may want to observe in a debugger). It's probably also possible to create a library that will also serve as the dynamic loader.

So, the "is this module an executable" question is not really well defined. This isn't an intrinsic property of the module, but rather a role that the module plays in a specific process. Once a process starts, this can be (and is) determined by the dynamic linker plugin. Before that, we need to rely on some external signal for that (if we need to know that at all).

With that in mind, I'm going to disable the before-run part of the test for elf.

@jimingham
Copy link
Collaborator Author

jimingham commented Jan 17, 2024 via email

@jimingham
Copy link
Collaborator Author

jimingham commented Jan 17, 2024 via email

@jimingham
Copy link
Collaborator Author

jimingham commented Jan 17, 2024 via email

@labath
Copy link
Collaborator

labath commented Jan 18, 2024

Will the value of ObjectFile::GetType ever be meaningful, then?

It's partially meaningful. eTypeCoreFile, eTypeObjectFile and eTypeExecutable (when it is set) should mean what they say. The problem is with eTypeSharedLibrary (and to a lesser degree with eTypeDebugInfo). Just because an object is a shared library doesn't mean that it can be executed (as the main executable or the dynamic linker). It's not that these concepts don't exist, it's just that it's not possible to determine the role of a file in isolation. I don't think that was anyone's intention, it's just ELF (unlike, say, MachO) doesn't have the flags to differentiate these.

Do you change it after running based on what the loader tells you?

We don't, and I don't think we should, because that's about a role that the object plays in a specific process. In theory you could use that object in a different role in another process.

So long as ObjectFileELF::CalculateType never returns ET_EXEC, this won't be an issue. But if some ObjectFile that's not in fact the main binary says it has this type, not only will this change to Append not do the right thing, Target::GetExecutableModule will also return this module.

CalculateType() will return eTypeExecutable for ET_EXEC, which is used for non-position-independent executables, and I don't think you can load those as libraries (the dynamic loader will not let them. However, there is nothing stopping a user from adding that module manually (exactly like your test does) and that could mess things up a bit. Not too much, because (outside of Swift, I guess) we don't care that much about which module is the main executable, but it will confuse anyone who asks for it. To be honest, I don't particularly like this functionality of automatically shuffling module lists. I think it would be better to have separate API to set the executable module or tag an existing module as such. It's probably the only thing that can reliably work for elf, but I also think its clearer/more explicit in general.

@jimingham
Copy link
Collaborator Author

jimingham commented Jan 18, 2024 via email

JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Jan 25, 2024
Ensure that the executable module is ModuleList[0] (llvm#78360)
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.
jimingham added a commit to jimingham/from-apple-llvm-project that referenced this pull request Feb 21, 2024
We claim in a couple places that the zeroth element of the module list
for a target is the main executable, but we don't actually enforce that
in the ModuleList class. As we saw, for instance, in

32dd5b2

it's not all that hard to get this to be off. This patch ensures that
the first object file of type Executable added to it is moved to the
front of the ModuleList. I also added a test for this.

In the normal course of operation, where the executable is added first,
this only adds a check for whether the first element in the module list
is an executable. If that's true, we just append as normal.

Note, the code in Target::GetExecutableModule doesn't actually agree
that the zeroth element must be the executable, it instead returns the
first Module of type Executable. But I can't tell whether that was a
change in intention or just working around the bug that we don't always
maintain this ordering. But given we've said this in scripting as well
as internally, I think we shouldn't change our minds about this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants