Skip to content

[lldb] Add count for number of DWO files loaded in statistics #144424

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 23, 2025

Conversation

qxy11
Copy link
Contributor

@qxy11 qxy11 commented Jun 16, 2025

Summary

A new totalLoadedDwoFileCount and totalDwoFileCount counters to available statisctics when calling "statistics dump".

  1. GetDwoFileCounts is created, and returns a pair of ints representing the number of loaded DWO files and the total number of DWO files, respectively. An override is implemented for SymbolFileDWARF that loops through each compile unit, and adds to a counter if it's a DWO unit, and then uses GetDwoSymbolFile(false) to check whether the DWO file was already loaded/parsed.

  2. In Statistics, use GetSeparateDebugInfo to sum up the total number of loaded/parsed DWO files along with the total number of DWO files. This is done by checking whether the DWO file was already successfully loaded in the collected DWO data, anding adding to the totalLoadedDwoFileCount, and adding to totalDwoFileCount for all CU units.

Expected Behavior

  • When binaries are compiled with split-dwarf and separate DWO files, totalLoadedDwoFileCount would be the number of loaded DWO files and totalDwoFileCount would be the total count of DWO files.
  • When using a DWP file instead of separate DWO files, totalLoadedDwoFileCount would be the number of parsed compile units, while totalDwoFileCount would be the total number of CUs in the DWP file. This should be similar to the counts we get from loading separate DWO files rather than only counting whether a single DWP file was loaded.
  • When not using split-dwarf, we expect both totalDwoFileCount and totalLoadedDwoFileCount to be 0 since no separate debug info is loaded.

Testing

Manual Testing
On an internal script that has many DWO files, statistics dump was called before and after a type lookup command. The totalLoadedDwoFileCount increased as expected after the type lookup.

(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 29,
}
(lldb) type lookup folly::Optional<unsigned int>::Storage
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
typedef std::conditional<true, folly::Optional<unsigned int>::StorageTriviallyDestructible, folly::Optional<unsigned int>::StorageNonTriviallyDestructible>::type
...
(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 2160,
}

Unit test
Added three unit tests that build with new "third.cpp" and "baz.cpp" files. For tests with w/ flags -gsplit-dwarf -gpubnames, this generates 2 DWO files. Then, the test incrementally adds breakpoints, and does a type lookup, and the count should increase for each of these as new DWO files get loaded to support these.

$ bin/lldb-dotest -p TestStats.py ~/llvm-sand/external/llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 20 tests in 211.738s

OK (skipped=3)

@qxy11 qxy11 requested a review from JDevlieghere as a code owner June 16, 2025 19:49
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the lldb label Jun 16, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 16, 2025

@llvm/pr-subscribers-lldb

Author: None (qxy11)

Changes

Summary

A new totalLoadedDwoFileCount is added to available statisctics when calling "statistics dump".

  1. The counter in SymbolFileDWARF m_loaded_dwo_file_count is added, and accessed through the GetLoadedDwoFileCount()
  2. m_loaded_dwo_file_count is incremented through a new protected function IncrementLoadedDwoFileCount(), which is called in the constructor of SymbolFileDWARFDwo (as opposed to directly incrementing the counter in SymbolFileDwarf to prevent inconsistent updates to the counter in the future).
  3. The GetLoadedDwoFileCount() API is added to SymbolFile.h for access to the DWO count from the symbol files loaded in Statistics, which returns 0 by default for all symbol file types except for SymbolFileDWARF.

Testing

Manual Testing
On an internal script that has many DWO files, statistics dump was called before and after a type lookup command. The totalLoadedDwoFileCount increased as expected after the type lookup.

(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 29,
}
(lldb) type lookup folly::Optional&lt;unsigned int&gt;::Storage
typedef std::conditional&lt;true, folly::Optional&lt;unsigned int&gt;::StorageTriviallyDestructible, folly::Optional&lt;unsigned int&gt;::StorageNonTriviallyDestructible&gt;::type
typedef std::conditional&lt;true, folly::Optional&lt;unsigned int&gt;::StorageTriviallyDestructible, folly::Optional&lt;unsigned int&gt;::StorageNonTriviallyDestructible&gt;::type
...
(lldb) statistics dump
{
  ...
  "totalLoadedDwoFileCount": 2160,
}

Unit test
Added a unit test that builds new "third.cpp" and "baz.cpp" files w/ flags -gsplit-dwarf -gpubnames. This generated 2 DWO files. Then, the test incrementally adds breakpoints, and does a type lookup, and the count should increase for each of these as new DWO files get loaded to support these.

$ bin/lldb-dotest -p TestStats.py ~/llvm-sand/external/llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 20 tests in 211.738s

OK (skipped=3)

Full diff: https://github.com/llvm/llvm-project/pull/144424.diff

10 Files Affected:

  • (modified) lldb/include/lldb/Symbol/SymbolFile.h (+5)
  • (modified) lldb/packages/Python/lldbsuite/test/builders/builder.py (+18-7)
  • (modified) lldb/packages/Python/lldbsuite/test/make/Makefile.rules (+4)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h (+9)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp (+1-1)
  • (modified) lldb/source/Target/Statistics.cpp (+5)
  • (modified) lldb/test/API/commands/statistics/basic/TestStats.py (+31)
  • (added) lldb/test/API/commands/statistics/basic/baz.cpp (+12)
  • (added) lldb/test/API/commands/statistics/basic/third.cpp (+7)
diff --git a/lldb/include/lldb/Symbol/SymbolFile.h b/lldb/include/lldb/Symbol/SymbolFile.h
index 75c7f230ddf3d..42fbc2508889a 100644
--- a/lldb/include/lldb/Symbol/SymbolFile.h
+++ b/lldb/include/lldb/Symbol/SymbolFile.h
@@ -425,6 +425,11 @@ class SymbolFile : public PluginInterface {
   /// Reset the statistics for the symbol file.
   virtual void ResetStatistics() {}
 
+  /// Get the number of loaded DWO files for this symbol file.
+  /// This is used for statistics gathering and will return 0 for most
+  /// symbol file implementations except DWARF symbol files.
+  virtual uint32_t GetLoadedDwoFileCount() const { return 0; }
+
   /// Get the additional modules that this symbol file uses to parse debug info.
   ///
   /// Some debug info is stored in stand alone object files that are represented
diff --git a/lldb/packages/Python/lldbsuite/test/builders/builder.py b/lldb/packages/Python/lldbsuite/test/builders/builder.py
index de05732469448..572abf590345c 100644
--- a/lldb/packages/Python/lldbsuite/test/builders/builder.py
+++ b/lldb/packages/Python/lldbsuite/test/builders/builder.py
@@ -247,13 +247,24 @@ def getLLDBObjRoot(self):
     def _getDebugInfoArgs(self, debug_info):
         if debug_info is None:
             return []
-        if debug_info == "dwarf":
-            return ["MAKE_DSYM=NO"]
-        if debug_info == "dwo":
-            return ["MAKE_DSYM=NO", "MAKE_DWO=YES"]
-        if debug_info == "gmodules":
-            return ["MAKE_DSYM=NO", "MAKE_GMODULES=YES"]
-        return None
+        
+        debug_options = debug_info if isinstance(debug_info, list) else [debug_info]
+        option_flags = {
+            "dwarf": {"MAKE_DSYM": "NO"},
+            "dwo": {"MAKE_DSYM": "NO", "MAKE_DWO": "YES"},
+            "gmodules": {"MAKE_DSYM": "NO", "MAKE_GMODULES": "YES"},
+            "debug_names": {"MAKE_DEBUG_NAMES": "YES"},
+        }
+        
+        # Collect all flags, with later options overriding earlier ones
+        flags = {}
+        
+        for option in debug_options:
+            if not option or option not in option_flags:
+                return None # Invalid options
+            flags.update(option_flags[option])
+        
+        return [f"{key}={value}" for key, value in flags.items()]
 
     def getBuildCommand(
         self,
diff --git a/lldb/packages/Python/lldbsuite/test/make/Makefile.rules b/lldb/packages/Python/lldbsuite/test/make/Makefile.rules
index 06959f226066a..58833e1b0cc78 100644
--- a/lldb/packages/Python/lldbsuite/test/make/Makefile.rules
+++ b/lldb/packages/Python/lldbsuite/test/make/Makefile.rules
@@ -276,6 +276,10 @@ ifeq "$(MAKE_DWO)" "YES"
 	CFLAGS += -gsplit-dwarf
 endif
 
+ifeq "$(MAKE_DEBUG_NAMES)" "YES"
+	CFLAGS += -gpubnames
+endif
+
 ifeq "$(USE_PRIVATE_MODULE_CACHE)" "YES"
 THE_CLANG_MODULE_CACHE_DIR := $(BUILDDIR)/private-module-cache
 else
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 71f204c03a42a..acd9d2230c201 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -4358,6 +4358,7 @@ StatsDuration::Duration SymbolFileDWARF::GetDebugInfoIndexTime() {
 
 void SymbolFileDWARF::ResetStatistics() {
   m_parse_time.reset();
+  m_loaded_dwo_file_count = 0;
   if (m_index)
     return m_index->ResetStatistics();
 }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index d2d30d7decb16..d695961bbfc1d 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
@@ -315,6 +315,11 @@ class SymbolFileDWARF : public SymbolFileCommon {
 
   void ResetStatistics() override;
 
+  /// Get the number of loaded DWO files for this symbol file
+  uint32_t GetLoadedDwoFileCount() const override {
+    return m_loaded_dwo_file_count;
+  }
+
   virtual lldb::offset_t
   GetVendorDWARFOpcodeSize(const DataExtractor &data,
                            const lldb::offset_t data_offset,
@@ -497,6 +502,8 @@ class SymbolFileDWARF : public SymbolFileCommon {
 
   void InitializeFirstCodeAddress();
 
+  void IncrementLoadedDwoFileCount() { m_loaded_dwo_file_count++; }
+
   void
   GetCompileOptions(std::unordered_map<lldb::CompUnitSP, Args> &args) override;
 
@@ -550,6 +557,8 @@ class SymbolFileDWARF : public SymbolFileCommon {
   /// valid value that can be used in DIERef objects which will contain
   /// an index that identifies the .DWO or .o file.
   std::optional<uint64_t> m_file_index;
+  /// Count of loaded DWO files for this symbol file
+  uint32_t m_loaded_dwo_file_count = 0;
 };
 } // namespace dwarf
 } // namespace lldb_private::plugin
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
index c1829abe72933..1f0e847da2905 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
@@ -31,7 +31,7 @@ SymbolFileDWARFDwo::SymbolFileDWARFDwo(SymbolFileDWARF &base_symbol_file,
                                    /*update_module_section_list*/ false)),
       m_base_symbol_file(base_symbol_file) {
   SetFileIndex(id);
-
+  m_base_symbol_file.IncrementLoadedDwoFileCount();
   // Parsing of the dwarf unit index is not thread-safe, so we need to prime it
   // to enable subsequent concurrent lookups.
   m_context.GetAsLLVM().getCUIndex();
diff --git a/lldb/source/Target/Statistics.cpp b/lldb/source/Target/Statistics.cpp
index 6ec8f8963baf9..b30d7f755f2fc 100644
--- a/lldb/source/Target/Statistics.cpp
+++ b/lldb/source/Target/Statistics.cpp
@@ -322,6 +322,7 @@ llvm::json::Value DebuggerStats::ReportStatistics(
   uint32_t num_modules_with_incomplete_types = 0;
   uint32_t num_stripped_modules = 0;
   uint32_t symtab_symbol_count = 0;
+  uint32_t total_loaded_dwo_file_count = 0;
   for (size_t image_idx = 0; image_idx < num_modules; ++image_idx) {
     Module *module = target != nullptr
                          ? target->GetImages().GetModuleAtIndex(image_idx).get()
@@ -353,6 +354,9 @@ llvm::json::Value DebuggerStats::ReportStatistics(
         for (const auto &symbol_module : symbol_modules.Modules())
           module_stat.symfile_modules.push_back((intptr_t)symbol_module.get());
       }
+      // Add DWO file count from this symbol file (will be 0 for non-DWARF
+      // symbol files)
+      total_loaded_dwo_file_count += sym_file->GetLoadedDwoFileCount();
       module_stat.debug_info_index_loaded_from_cache =
           sym_file->GetDebugInfoIndexWasLoadedFromCache();
       if (module_stat.debug_info_index_loaded_from_cache)
@@ -427,6 +431,7 @@ llvm::json::Value DebuggerStats::ReportStatistics(
       {"totalDebugInfoEnabled", num_debug_info_enabled_modules},
       {"totalSymbolTableStripped", num_stripped_modules},
       {"totalSymbolTableSymbolCount", symtab_symbol_count},
+      {"totalLoadedDwoFileCount", total_loaded_dwo_file_count},
   };
 
   if (include_targets) {
diff --git a/lldb/test/API/commands/statistics/basic/TestStats.py b/lldb/test/API/commands/statistics/basic/TestStats.py
index 83132b40d85db..de4dafaf0a77d 100644
--- a/lldb/test/API/commands/statistics/basic/TestStats.py
+++ b/lldb/test/API/commands/statistics/basic/TestStats.py
@@ -177,6 +177,7 @@ def test_default_no_run(self):
             "totalDebugInfoIndexLoadedFromCache",
             "totalDebugInfoIndexSavedToCache",
             "totalDebugInfoParseTime",
+            "totalLoadedDwoFileCount",
         ]
         self.verify_keys(debug_stats, '"debug_stats"', debug_stat_keys, None)
         if self.getPlatform() != "windows":
@@ -287,6 +288,7 @@ def test_default_with_run(self):
             "totalDebugInfoIndexLoadedFromCache",
             "totalDebugInfoIndexSavedToCache",
             "totalDebugInfoParseTime",
+            "totalLoadedDwoFileCount",
         ]
         self.verify_keys(debug_stats, '"debug_stats"', debug_stat_keys, None)
         stats = debug_stats["targets"][0]
@@ -325,6 +327,7 @@ def test_memory(self):
             "totalDebugInfoIndexLoadedFromCache",
             "totalDebugInfoIndexSavedToCache",
             "totalDebugInfoByteSize",
+            "totalLoadedDwoFileCount",
         ]
         self.verify_keys(debug_stats, '"debug_stats"', debug_stat_keys, None)
 
@@ -377,6 +380,7 @@ def test_modules(self):
             "totalDebugInfoIndexLoadedFromCache",
             "totalDebugInfoIndexSavedToCache",
             "totalDebugInfoByteSize",
+            "totalLoadedDwoFileCount",
         ]
         self.verify_keys(debug_stats, '"debug_stats"', debug_stat_keys, None)
         stats = debug_stats["targets"][0]
@@ -485,6 +489,7 @@ def test_breakpoints(self):
             "totalDebugInfoIndexLoadedFromCache",
             "totalDebugInfoIndexSavedToCache",
             "totalDebugInfoByteSize",
+            "totalLoadedDwoFileCount",
         ]
         self.verify_keys(debug_stats, '"debug_stats"', debug_stat_keys, None)
         target_stats = debug_stats["targets"][0]
@@ -512,6 +517,32 @@ def test_breakpoints(self):
             self.verify_keys(
                 breakpoint, 'target_stats["breakpoints"]', bp_keys_exist, None
             )
+    
+    def test_loaded_dwo_file_count(self):
+        """
+        Test "statistics dump" and the loaded dwo file count.
+        Builds a binary w/ separate .dwo files and debug_names, and then
+        verifies the loaded dwo file count is the expected count after running
+        various commands
+        """
+        da = {"CXX_SOURCES": "third.cpp baz.cpp", "EXE": self.getBuildArtifact("a.out")}
+        self.build(dictionary=da, debug_info=["dwo", "debug_names"])
+        self.addTearDownCleanup(dictionary=da)
+        exe = self.getBuildArtifact("a.out")
+        target = self.createTestTarget(file_path=exe)
+        debug_stats = self.get_stats()
+
+        self.assertIn("totalLoadedDwoFileCount", debug_stats)
+        self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 0)
+        
+        self.runCmd("b main")
+        debug_stats_after_main = self.get_stats()
+        self.assertEqual(debug_stats_after_main["totalLoadedDwoFileCount"], 1)
+        
+        self.runCmd("type lookup Baz")
+        debug_stats_after_type_lookup = self.get_stats()
+        self.assertEqual(debug_stats_after_type_lookup["totalLoadedDwoFileCount"], 2)
+        
 
     @skipUnlessDarwin
     @no_debug_info_test
diff --git a/lldb/test/API/commands/statistics/basic/baz.cpp b/lldb/test/API/commands/statistics/basic/baz.cpp
new file mode 100644
index 0000000000000..536758b17d839
--- /dev/null
+++ b/lldb/test/API/commands/statistics/basic/baz.cpp
@@ -0,0 +1,12 @@
+// Helper that the lldb command `statistics dump` works in split-dwarf mode.
+
+struct Baz {
+  int x;
+  bool y;
+};
+
+void baz() {
+  Baz b;
+  b.x = 1;
+  b.y = true;
+}
diff --git a/lldb/test/API/commands/statistics/basic/third.cpp b/lldb/test/API/commands/statistics/basic/third.cpp
new file mode 100644
index 0000000000000..3943b9c2faafe
--- /dev/null
+++ b/lldb/test/API/commands/statistics/basic/third.cpp
@@ -0,0 +1,7 @@
+// Test that the lldb command `statistics dump` works.
+
+void baz();
+int main(void) {
+  baz();
+  return 0;
+}

@qxy11
Copy link
Contributor Author

qxy11 commented Jun 16, 2025

cc @dmpots @jeffreytan81 @clayborg

/// Get the number of loaded DWO files for this symbol file.
/// This is used for statistics gathering and will return 0 for most
/// symbol file implementations except DWARF symbol files.
virtual uint32_t GetLoadedDwoFileCount() const { return 0; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a GetSeparateDebugInfo member function that returns info about separate debug info (which dwo is one type). Looking here it looks like we could use it to find all the dwos that were loaded.

I wonder if it would be better to use this existing function when generating the stats instead of adding a new overload here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reference! In GetSeparateDebugInfo here, the call to GetDwoSymbolFile ends up loading all DWO files (even if they weren't already loaded). I think it'd work if we set the load_all_debug_info to false in the call and do GetDwoSymbolFile(false), in which case we'd have to add an extra load_all_debug_info argument to GetSeparateDebugInfo as well so that other existing use cases for GetSeparateDebugInfo don't get effected. I can try that instead of adding the new overload.

@jeffreytan81
Copy link
Contributor

Besides totalLoadedDwoFileCount, we need another totalDwoFileCount. The ratio between them will show us how much X% of dwo files have been loaded for the debug session.

Also, we should test this against dwp file situation, and non-split-dwarf situation, what are the numbers we should report?

Address comments to use `GetSeparateDebugInfo` instead of adding a new
Symbol File function. Also add a totalLoadedDwoFileCount to statistics
dump and add some unit tests for expected behavior in the DWP case  and
non-split-dwarf cases.
…es_eager_loads_dwo_files

Adds an extra unit test and refactors some logic into a helper function.
Also added additional comments in code and tests to clarify purpose of
debug_names usage and how it prevents the eager loading of DWO files
Copy link
Contributor

@dmpots dmpots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! You should get approvals from @jeffreytan81 and @clayborg as well before merging.

StructuredData::Dictionary separate_debug_info;
if (sym_file->GetSeparateDebugInfo(separate_debug_info,
/*errors_only=*/false,
/*load_all_debug_info*/ false)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*load_all_debug_info=*/false

Comment on lines 389 to 390
UpdateDwoFileCounts(sym_file, total_dwo_file_count,
total_loaded_dwo_file_count);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably break this down and add statistics for each module using keys named "loadedDwoFileCount" and "dwoFileCount" in each module's statistics. And then we still add totals at the end as you have. So maybe fetch into module_stat first and then update the totals:

UpdateDwoFileCounts(sym_file, module_stat.total_dwo_file_count,
                          module_stat.loaded_dwo_file_count);
total_dwo_file_count += module_stat.total_dwo_file_count;
total_loaded_dwo_file_count += module_stat.loaded_dwo_file_count;

We will need to add two things to ModuleStats:

uint32_t dwo_file_count = 0;
uint32_t loaded_dwo_file_count = 0;

@qxy11 qxy11 requested a review from clayborg June 18, 2025 16:21
// For DWP files, this increments counts for both total and successfully
// loaded DWO CUs. For non split-dwarf files, these counts should not change
StructuredData::Dictionary separate_debug_info;
if (sym_file->GetSeparateDebugInfo(separate_debug_info,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not recommend calling GetSeparateDebugInfo API for this task. GetSeparateDebugInfo API constructs StructuredData::Dictionary JSON style objects internally which is not cheap (I have seen profile traces that indicate constructing StructuredData::Dictionary as hot paths). This can get worse considering we are using statistics dump in lldb session logging by default. (means we are paying the cost every time).

If we want to reuse it, I would suggest refactoring and reuse the necessary underlying implementation without extra expensive burdens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll make a separate API for getting the counts here without constructing StructuredData::Dictionary.

total_dwo_file_count++;

bool loaded = false;
if (dict->GetValueForKeyAsBoolean("loaded", loaded) && loaded)
Copy link
Contributor

@jeffreytan81 jeffreytan81 Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure this is correct for dwp. In dwp scenario, there is one dwp file for the entire module (instead of one dwo for each compile unit), so only one SymbolFileDWARFDwo will be created. Then, if you check the code here https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L4185-L4194, it will mark that dwo as "loaded if SymbolFileDWARFDwo exists, which will happen for all dwo files during dwp scenarios which can be wrong -- if you have .debug_names, lldb won't need to parse all debug info during startup but parse/load lazily so only partial of dwp are parsed while the current implementation reports all parsed/loaded. This is probably a bug in original SymbolFileDWARF::GetSeparateDebugInfo implementation though, cc @zhyty

Copy link
Contributor Author

@qxy11 qxy11 Jun 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work for DWP and correctly report the # of parsed CUs (see the unit test added test_dwp_dwo_file_count here).

My understanding is that each CU has its own m_dwo shared_ptr object rather than there being a global DWP reference that's shared. It looks like the DWP file first gets loaded here, the specific DWO unit gets indexed here, and then only after the specific DWO data for the CU is parsed does m_dwo get set. So in the code for GetSeparateDebugInfo it should get the correct per-unit counts.

@@ -282,6 +282,11 @@ class SymbolFileDWARF : public SymbolFileCommon {
bool GetSeparateDebugInfo(StructuredData::Dictionary &d,
bool errors_only) override;

// Gets a pair of loaded and total dwo file counts.
// For DWP files, this reports the counts for successfully loaded DWO CUs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"For DWP files", do you mean "For split dwarf files"? Otherwise, you need to mention "dwo files" in addition to "dwp files".

@jeffreytan81 jeffreytan81 merged commit 3095d3a into llvm:main Jun 23, 2025
4 of 7 checks passed
Copy link

@qxy11 Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

Copy link

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r HEAD~1...HEAD lldb/packages/Python/lldbsuite/test/builders/builder.py lldb/test/API/commands/statistics/basic/TestStats.py
View the diff from darker here.
--- packages/Python/lldbsuite/test/builders/builder.py	2025-06-18 23:14:35.000000 +0000
+++ packages/Python/lldbsuite/test/builders/builder.py	2025-06-23 18:53:00.379948 +0000
@@ -245,28 +245,28 @@
         return ["LLDB_OBJ_ROOT={}".format(configuration.lldb_obj_root)]
 
     def _getDebugInfoArgs(self, debug_info):
         if debug_info is None:
             return []
-        
+
         debug_options = debug_info if isinstance(debug_info, list) else [debug_info]
         option_flags = {
             "dwarf": {"MAKE_DSYM": "NO"},
             "dwo": {"MAKE_DSYM": "NO", "MAKE_DWO": "YES"},
             "gmodules": {"MAKE_DSYM": "NO", "MAKE_GMODULES": "YES"},
             "debug_names": {"MAKE_DEBUG_NAMES": "YES"},
             "dwp": {"MAKE_DSYM": "NO", "MAKE_DWP": "YES"},
         }
-        
+
         # Collect all flags, with later options overriding earlier ones
         flags = {}
-        
+
         for option in debug_options:
             if not option or option not in option_flags:
-                return None # Invalid options
+                return None  # Invalid options
             flags.update(option_flags[option])
-        
+
         return [f"{key}={value}" for key, value in flags.items()]
 
     def getBuildCommand(
         self,
         debug_info,
--- test/API/commands/statistics/basic/TestStats.py	2025-06-18 23:14:35.000000 +0000
+++ test/API/commands/statistics/basic/TestStats.py	2025-06-23 18:53:00.691815 +0000
@@ -522,10 +522,11 @@
         ]
         for breakpoint in breakpoints:
             self.verify_keys(
                 breakpoint, 'target_stats["breakpoints"]', bp_keys_exist, None
             )
+
     def test_non_split_dwarf_has_no_dwo_files(self):
         """
         Test "statistics dump" and the dwo file count.
         Builds a binary without split-dwarf mode, and then
         verifies the dwo file count is zero after running "statistics dump"
@@ -536,34 +537,34 @@
         exe = self.getBuildArtifact("a.out")
         target = self.createTestTarget(file_path=exe)
         debug_stats = self.get_stats()
         self.assertIn("totalDwoFileCount", debug_stats)
         self.assertIn("totalLoadedDwoFileCount", debug_stats)
-        
+
         # Verify that the dwo file count is zero
         self.assertEqual(debug_stats["totalDwoFileCount"], 0)
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 0)
-    
+
     def test_no_debug_names_eager_loads_dwo_files(self):
         """
         Test the eager loading behavior of DWO files when debug_names is absent by
         building a split-dwarf binary without debug_names and then running "statistics dump".
         DWO file loading behavior:
         - With debug_names: DebugNamesDWARFIndex allows for lazy loading.
           DWO files are loaded on-demand when symbols are actually looked up
         - Without debug_names: ManualDWARFIndex uses eager loading.
-          All DWO files are loaded upfront during the first symbol lookup to build a manual index. 
+          All DWO files are loaded upfront during the first symbol lookup to build a manual index.
         """
         da = {"CXX_SOURCES": "third.cpp baz.cpp", "EXE": self.getBuildArtifact("a.out")}
         self.build(dictionary=da, debug_info=["dwo"])
         self.addTearDownCleanup(dictionary=da)
         exe = self.getBuildArtifact("a.out")
         target = self.createTestTarget(file_path=exe)
         debug_stats = self.get_stats()
         self.assertIn("totalDwoFileCount", debug_stats)
         self.assertIn("totalLoadedDwoFileCount", debug_stats)
-        
+
         # Verify that all DWO files are loaded
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 2)
 
     def test_split_dwarf_dwo_file_count(self):
@@ -588,43 +589,43 @@
         self.assertEqual(len(debug_stats["modules"]), 1)
         self.assertIn("totalLoadedDwoFileCount", debug_stats)
         self.assertIn("totalDwoFileCount", debug_stats)
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 0)
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
-        
+
         # Since there's only one module, module stats should have the same counts as total counts
         self.assertIn("dwoFileCount", debug_stats["modules"][0])
         self.assertIn("loadedDwoFileCount", debug_stats["modules"][0])
         self.assertEqual(debug_stats["modules"][0]["loadedDwoFileCount"], 0)
         self.assertEqual(debug_stats["modules"][0]["dwoFileCount"], 2)
-        
+
         # 2) Setting breakpoint in main triggers loading of third.dwo (contains main function)
         self.runCmd("b main")
         debug_stats = self.get_stats()
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 1)
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
 
         self.assertEqual(debug_stats["modules"][0]["loadedDwoFileCount"], 1)
         self.assertEqual(debug_stats["modules"][0]["dwoFileCount"], 2)
-        
+
         # 3) Type lookup forces loading of baz.dwo (contains struct Baz definition)
         self.runCmd("type lookup Baz")
         debug_stats = self.get_stats()
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 2)
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
 
         self.assertEqual(debug_stats["modules"][0]["loadedDwoFileCount"], 2)
         self.assertEqual(debug_stats["modules"][0]["dwoFileCount"], 2)
-    
+
     def test_dwp_dwo_file_count(self):
         """
         Test "statistics dump" and the loaded dwo file count.
         Builds a binary w/ a separate .dwp file and debug_names, and then
         verifies the loaded dwo file count is the expected count after running
         various commands.
 
-        We expect the DWO file counters to reflect the number of compile units 
+        We expect the DWO file counters to reflect the number of compile units
         loaded from the DWP file (each representing what was originally a separate DWO file)
         """
         da = {"CXX_SOURCES": "third.cpp baz.cpp", "EXE": self.getBuildArtifact("a.out")}
         self.build(dictionary=da, debug_info=["dwp", "debug_names"])
         self.addTearDownCleanup(dictionary=da)
@@ -635,23 +636,22 @@
         # Initially: 2 DWO files available but none loaded yet
         self.assertIn("totalLoadedDwoFileCount", debug_stats)
         self.assertIn("totalDwoFileCount", debug_stats)
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 0)
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
-        
+
         # Setting breakpoint in main triggers parsing of the CU within a.dwp corresponding to third.dwo (contains main function)
         self.runCmd("b main")
         debug_stats = self.get_stats()
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 1)
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
-        
+
         # Type lookup forces parsing of the CU within a.dwp corresponding to baz.dwo (contains struct Baz definition)
         self.runCmd("type lookup Baz")
         debug_stats = self.get_stats()
         self.assertEqual(debug_stats["totalDwoFileCount"], 2)
         self.assertEqual(debug_stats["totalLoadedDwoFileCount"], 2)
-        
 
     @skipUnlessDarwin
     @no_debug_info_test
     def test_dsym_binary_has_symfile_in_stats(self):
         """

@vvereschaka
Copy link
Contributor

Hi @jeffreytan81,

looks like TestStats.py test is not working properly with these changes
https://lab.llvm.org/staging/#/builders/211/builds/565

would you take care of it?

@jeffreytan81
Copy link
Contributor

@vvereschaka, are you only seeing TestStats.py timeout on Windows or Linux as well? cc @qxy11 to follow-up.

@Michael137
Copy link
Member

Reverting for now to unblock macOS CI:

10:20:36  FAIL: test_dwp_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the loaded dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 639, in test_dwp_dwo_file_count
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_no_debug_names_eager_loads_dwo_files (TestStats.TestCase)
10:20:36      Test the eager loading behavior of DWO files when debug_names is absent by
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 566, in test_no_debug_names_eager_loads_dwo_files
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_split_dwarf_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 588, in test_split_dwarf_dwo_file_count
10:20:36      self.assertEqual(len(debug_stats["modules"]), 1)
10:20:36  AssertionError: 42 != 1
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang

Michael137 added a commit that referenced this pull request Jun 24, 2025
#145494)

Reverts #144424

Caused CI failures.

macOS CI failure was:
```
10:20:36  FAIL: test_dwp_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the loaded dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 639, in test_dwp_dwo_file_count
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_no_debug_names_eager_loads_dwo_files (TestStats.TestCase)
10:20:36      Test the eager loading behavior of DWO files when debug_names is absent by
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 566, in test_no_debug_names_eager_loads_dwo_files
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_split_dwarf_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 588, in test_split_dwarf_dwo_file_count
10:20:36      self.assertEqual(len(debug_stats["modules"]), 1)
10:20:36  AssertionError: 42 != 1
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
```
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 24, 2025
… statistics" (#145494)

Reverts llvm/llvm-project#144424

Caused CI failures.

macOS CI failure was:
```
10:20:36  FAIL: test_dwp_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the loaded dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 639, in test_dwp_dwo_file_count
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_no_debug_names_eager_loads_dwo_files (TestStats.TestCase)
10:20:36      Test the eager loading behavior of DWO files when debug_names is absent by
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 566, in test_no_debug_names_eager_loads_dwo_files
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_split_dwarf_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 588, in test_split_dwarf_dwo_file_count
10:20:36      self.assertEqual(len(debug_stats["modules"]), 1)
10:20:36  AssertionError: 42 != 1
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
```
@DavidSpickett
Copy link
Collaborator

This also times out on Windows on Arm - https://lab.llvm.org/buildbot/#/builders/141/builds/9615.

First guess - it's waiting for DWO files to be loaded but we'll never use those on Windows?

DrSergei pushed a commit to DrSergei/llvm-project that referenced this pull request Jun 24, 2025
llvm#145494)

Reverts llvm#144424

Caused CI failures.

macOS CI failure was:
```
10:20:36  FAIL: test_dwp_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the loaded dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 639, in test_dwp_dwo_file_count
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_no_debug_names_eager_loads_dwo_files (TestStats.TestCase)
10:20:36      Test the eager loading behavior of DWO files when debug_names is absent by
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 566, in test_no_debug_names_eager_loads_dwo_files
10:20:36      self.assertEqual(debug_stats["totalDwoFileCount"], 2)
10:20:36  AssertionError: 0 != 2
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
10:20:36  ======================================================================
10:20:36  FAIL: test_split_dwarf_dwo_file_count (TestStats.TestCase)
10:20:36      Test "statistics dump" and the dwo file count.
10:20:36  ----------------------------------------------------------------------
10:20:36  Traceback (most recent call last):
10:20:36    File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 588, in test_split_dwarf_dwo_file_count
10:20:36      self.assertEqual(len(debug_stats["modules"]), 1)
10:20:36  AssertionError: 42 != 1
10:20:36  Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang
```
jeffreytan81 pushed a commit that referenced this pull request Jun 24, 2025
…144424" (#145572)

This relands changes in #144424 for adding a count of DWO files
parsed/loaded and the total number of DWO files.

The previous PR was reverted in #145494 due to the newly added unit
tests failing on Windows and MacOS CIs since these platforms don't
support DWO. This change add an additional
`@add_test_categories(["dwo"])` to the new tests to
[skip](https://github.com/llvm/llvm-project/blob/cd46354dbd10820158edabe14dbd49d9f9010722/lldb/packages/Python/lldbsuite/test/test_categories.py#L56)
these tests on Windows/MacOS.

Original PR: #144424

### Testing
Ran unit tests
```
$ bin/lldb-dotest -p TestStats.py llvm-project/lldb/test/API/commands/statistics/basic/
----------------------------------------------------------------------
Ran 24 tests in 211.391s

OK (skipped=3)
```
@vvereschaka
Copy link
Contributor

@jeffreytan81 ,

@vvereschaka, are you only seeing TestStats.py timeout on Windows or Linux as well? cc @qxy11 to follow-up.

I seen it on Windows (non-cross) only. The linux builds get passed without that problem.

@jeffreytan81
Copy link
Contributor

@jeffreytan81 ,

@vvereschaka, are you only seeing TestStats.py timeout on Windows or Linux as well? cc @qxy11 to follow-up.

I seen it on Windows (non-cross) only. The linux builds get passed without that problem.

Thanks for checking. We did a relanding only enabling the test for dwo category, #145572. Let us know if there are still issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants