[BOLT][DWARF] Deduplicate Foreign TU list #97629

Merged
merged 2 commits into from
Jul 4, 2024

@ayermolo ayermolo commented Jul 3, 2024

Multiple type units (TUs) with the same type hash can appear across different DWO files; in larger binaries there can be thousands of them. These TUs may be structurally different, so we still need to output Entries for all of them, but for the purpose of looking up a TU hash we only need one entry per hash in the Foreign TU list.
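
A minimal sketch of the deduplication idea, assuming a simplified stand-in for the table: `ForeignTUTable`, `add`, and `indexOf` are hypothetical names for illustration, not BOLT's actual interface. Each distinct hash gets one list slot, and every later unit with the same hash reuses that slot's index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the Foreign TU bookkeeping; not BOLT's class.
class ForeignTUTable {
public:
  // Registers a TU hash and returns its index in the list. A hash that was
  // already registered keeps its existing index and adds no new list entry.
  uint32_t add(uint64_t TUHash) {
    auto Result =
        HashToIndex.insert({TUHash, static_cast<uint32_t>(List.size())});
    if (Result.second) // first time this hash is seen
      List.push_back(TUHash);
    return Result.first->second;
  }

  // Returns the index of a previously registered hash.
  uint32_t indexOf(uint64_t TUHash) const {
    auto It = HashToIndex.find(TUHash);
    assert(It != HashToIndex.end() && "TU hash was never registered");
    return It->second;
  }

  // Number of entries in the deduplicated list.
  std::size_t size() const { return List.size(); }

private:
  std::vector<uint64_t> List; // one entry per distinct hash
  std::unordered_map<uint64_t, uint32_t> HashToIndex;
};
```

With four units whose hashes are A, B, C, B, this sketch ends up with three list entries, and the fourth unit resolves to index 1 rather than 3.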

Test Plan: ninja check-bolt

Reviewers: #llvm-bolt

Differential Revision: https://phabricator.intern.facebook.com/D59343146

Tasks: T188628025

llvmbot commented Jul 3, 2024

@llvm/pr-subscribers-bolt

Author: Alexander Yermolovich (ayermolo)

Full diff: https://github.com/llvm/llvm-project/pull/97629.diff

3 Files Affected:

  • (modified) bolt/include/bolt/Core/DebugNames.h (+4)
  • (modified) bolt/lib/Core/DebugNames.cpp (+12-3)
  • (modified) bolt/test/X86/dwarf5-df-types-debug-names.test (+5-6)
diff --git a/bolt/include/bolt/Core/DebugNames.h b/bolt/include/bolt/Core/DebugNames.h
index a14a30529fad5..e0a8103802de9 100644
--- a/bolt/include/bolt/Core/DebugNames.h
+++ b/bolt/include/bolt/Core/DebugNames.h
@@ -91,6 +91,10 @@ class DWARF5AcceleratorTable {
   uint64_t CurrentUnitOffset = 0;
   const DWARFUnit *CurrentUnit = nullptr;
   std::unordered_map<uint32_t, uint32_t> AbbrevTagToIndexMap;
+  /// Maps TU hashes to Foreign TU indices.
+  /// This is used to reduce the size of Foreign TU list since there could be
+  /// multiple TUs with the same hash.
+  std::unordered_map<uint64_t, uint32_t> TUHashToIndexMap;
 
   /// Represents a group of entries with identical name (and hence, hash value).
   struct HashData {
diff --git a/bolt/lib/Core/DebugNames.cpp b/bolt/lib/Core/DebugNames.cpp
index ebe895e019ccb..640b29ec36d5c 100644
--- a/bolt/lib/Core/DebugNames.cpp
+++ b/bolt/lib/Core/DebugNames.cpp
@@ -90,7 +90,11 @@ void DWARF5AcceleratorTable::addUnit(DWARFUnit &Unit,
       auto Iter = CUOffsetsToPatch.insert({*DWOID, CUList.size()});
       if (Iter.second)
         CUList.push_back(BADCUOFFSET);
-      ForeignTUList.push_back(cast<DWARFTypeUnit>(&Unit)->getTypeHash());
+      const uint64_t TUHash = cast<DWARFTypeUnit>(&Unit)->getTypeHash();
+      if (!TUHashToIndexMap.count(TUHash)) {
+        TUHashToIndexMap.insert({TUHash, ForeignTUList.size()});
+        ForeignTUList.push_back(TUHash);
+      }
     } else {
       LocalTUList.push_back(CurrentUnitOffset);
     }
@@ -231,8 +235,13 @@ DWARF5AcceleratorTable::addAccelTableEntry(
     IsTU = Unit.isTypeUnit();
     DieTag = Die.getTag();
     if (IsTU) {
-      if (DWOID)
-        return ForeignTUList.size() - 1;
+      if (DWOID) {
+        const uint64_t TUHash = cast<DWARFTypeUnit>(&Unit)->getTypeHash();
+        auto Iter = TUHashToIndexMap.find(TUHash);
+        assert(Iter != TUHashToIndexMap.end() &&
+               "Could not find TU hash in map");
+        return Iter->second;
+      }
       return LocalTUList.size() - 1;
     }
     return CUList.size() - 1;
diff --git a/bolt/test/X86/dwarf5-df-types-debug-names.test b/bolt/test/X86/dwarf5-df-types-debug-names.test
index f5a2c9c10353e..7c1c8e4fd5b38 100644
--- a/bolt/test/X86/dwarf5-df-types-debug-names.test
+++ b/bolt/test/X86/dwarf5-df-types-debug-names.test
@@ -18,19 +18,19 @@
 ; BOLT: type_signature = [[TYPE1:0x[0-9a-f]*]]
 ; BOLT: Compile Unit
 ; BOLT: type_signature = [[TYPE2:0x[0-9a-f]*]]
-; BOLT: type_signature = [[TYPE3:0x[0-9a-f]*]]
+; BOLT: type_signature = [[TYPE1]]
 ; BOLT: Compile Unit
 ; BOLT: [[OFFSET:0x[0-9a-f]*]]: Compile Unit
 ; BOLT: [[OFFSET1:0x[0-9a-f]*]]: Compile Unit
 
 ; BOLT:       Name Index @ 0x0 {
 ; BOLT-NEXT:   Header {
-; BOLT-NEXT:     Length: 0x17E
+; BOLT-NEXT:     Length: 0x176
 ; BOLT-NEXT:     Format: DWARF32
 ; BOLT-NEXT:     Version: 5
 ; BOLT-NEXT:     CU count: 2
 ; BOLT-NEXT:     Local TU count: 0
-; BOLT-NEXT:     Foreign TU count: 4
+; BOLT-NEXT:     Foreign TU count: 3
 ; BOLT-NEXT:     Bucket count: 9
 ; BOLT-NEXT:     Name count: 9
 ; BOLT-NEXT:     Abbreviations table size: 0x37
@@ -44,7 +44,6 @@
 ; BOLT-NEXT:     ForeignTU[0]: [[TYPE]]
 ; BOLT-NEXT:     ForeignTU[1]: [[TYPE1]]
 ; BOLT-NEXT:     ForeignTU[2]: [[TYPE2]]
-; BOLT-NEXT:     ForeignTU[3]: [[TYPE3]]
 ; BOLT-NEXT:   ]
 ; BOLT-NEXT: Abbreviations [
 ; BOLT-NEXT:     Abbreviation [[ABBREV:0x[0-9a-f]*]] {
@@ -173,7 +172,7 @@
 ; BOLT-NEXT:       Entry @ {{.+}} {
 ; BOLT-NEXT:         Abbrev: [[ABBREV]]
 ; BOLT-NEXT:         Tag: DW_TAG_structure_type
-; BOLT-NEXT:         DW_IDX_type_unit: 0x03
+; BOLT-NEXT:         DW_IDX_type_unit: 0x01
 ; BOLT-NEXT:         DW_IDX_compile_unit: 0x01
 ; BOLT-NEXT:         DW_IDX_die_offset: 0x00000021
 ; BOLT-NEXT:         DW_IDX_parent: <parent not indexed>
@@ -237,7 +236,7 @@
 ; BOLT-NEXT:       Entry @ {{.+}} {
 ; BOLT-NEXT:         Abbrev: 0x5
 ; BOLT-NEXT:         Tag: DW_TAG_base_type
-; BOLT-NEXT:         DW_IDX_type_unit: 0x03
+; BOLT-NEXT:         DW_IDX_type_unit: 0x01
 ; BOLT-NEXT:         DW_IDX_compile_unit: 0x01
 ; BOLT-NEXT:         DW_IDX_die_offset: 0x00000048
 ; BOLT-NEXT:         DW_IDX_parent: <parent not indexed>
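
The updated expectations in this test are consistent with one duplicate entry being dropped. In the DWARF v5 name index, the Foreign TU list stores one 8-byte type signature per entry, so going from 4 entries to 3 should shrink the index by 8 bytes, which matches `Length` changing from 0x17E to 0x176. A small check of that arithmetic follows; `ForeignTUEntrySize` and `bytesSaved` are names made up for illustration.

```cpp
#include <cstdint>

// Size in bytes of one Foreign TU list entry: an 8-byte type signature.
constexpr uint64_t ForeignTUEntrySize = 8;

// Bytes removed from the name index when duplicate Foreign TU entries are
// dropped and the list shrinks from CountBefore to CountAfter entries.
constexpr uint64_t bytesSaved(uint64_t CountBefore, uint64_t CountAfter) {
  return (CountBefore - CountAfter) * ForeignTUEntrySize;
}

// Old Length minus the savings equals the new Length expected by the test.
static_assert(0x17E - bytesSaved(4, 3) == 0x176,
              "Length shrinks by one 8-byte signature");
```

This only accounts for the list itself; it assumes the entry encodings (for example the form used for `DW_IDX_type_unit`) are unchanged, which is what the test output suggests.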


@maksfb maksfb left a comment

Thanks for the fix.

@ayermolo ayermolo merged commit 361350f into llvm:main Jul 4, 2024
6 checks passed
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024