Skip to content

[lldb] (Begin to) support discontinuous lldb_private::Functions #115730

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 12, 2024

Conversation

labath
Copy link
Collaborator

@labath labath commented Nov 11, 2024

This is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it.

This is the beginning of a different, more fundamental approach to
handling. This PR tries to tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
@labath labath requested a review from jasonmolenda November 11, 2024 15:59
@labath labath requested a review from JDevlieghere as a code owner November 11, 2024 15:59
@llvmbot llvmbot added the lldb label Nov 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 11, 2024

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

This is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it.


Full diff: https://github.com/llvm/llvm-project/pull/115730.diff

11 Files Affected:

  • (modified) lldb/include/lldb/Symbol/Function.h (+6-2)
  • (modified) lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+9-11)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+11-22)
  • (modified) lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp (+1-1)
  • (modified) lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp (+3-3)
  • (modified) lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp (+1-1)
  • (modified) lldb/source/Symbol/Function.cpp (+29-10)
diff --git a/lldb/include/lldb/Symbol/Function.h b/lldb/include/lldb/Symbol/Function.h
index 8255c2baea7700..bbfc25fe74ea06 100644
--- a/lldb/include/lldb/Symbol/Function.h
+++ b/lldb/include/lldb/Symbol/Function.h
@@ -428,7 +428,7 @@ class Function : public UserID, public SymbolContextScope {
   ///     The section offset based address for this function.
   Function(CompileUnit *comp_unit, lldb::user_id_t func_uid,
            lldb::user_id_t func_type_uid, const Mangled &mangled,
-           Type *func_type, const AddressRange &range);
+           Type *func_type, AddressRanges ranges);
 
   /// Destructor.
   ~Function() override;
@@ -649,8 +649,12 @@ class Function : public UserID, public SymbolContextScope {
   /// All lexical blocks contained in this function.
   Block m_block;
 
+  /// List of address ranges belonging to the function.
+  AddressRanges m_ranges;
+
   /// The function address range that covers the widest range needed to contain
-  /// all blocks
+  /// all blocks. DEPRECATED: do not use this field in new code as the range may
+  /// include addresses belonging to other functions.
   AddressRange m_range;
 
   /// The frame base expression for variables that are relative to the frame
diff --git a/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp b/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
index 9e78ba8174e3d5..fadc19676609bf 100644
--- a/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
+++ b/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
@@ -255,7 +255,7 @@ FunctionSP SymbolFileBreakpad::GetOrCreateFunction(CompileUnit &comp_unit) {
           section_sp, address - section_sp->GetFileAddress(), record->Size);
       // Use the CU's id because every CU has only one function inside.
       func_sp = std::make_shared<Function>(&comp_unit, id, 0, func_name,
-                                           nullptr, func_range);
+                                           nullptr, AddressRanges{func_range});
       comp_unit.AddFunction(func_sp);
     }
   }
diff --git a/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp b/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
index bb738c3dcc54a0..15e8d38e7f334b 100644
--- a/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
+++ b/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
@@ -829,7 +829,7 @@ size_t SymbolFileCTF::ParseFunctions(CompileUnit &cu) {
       lldb::user_id_t func_uid = m_functions.size();
       FunctionSP function_sp = std::make_shared<Function>(
           &cu, func_uid, function_type_uid, symbol->GetMangled(), type_sp.get(),
-          func_range);
+          AddressRanges{func_range});
       m_functions.emplace_back(function_sp);
       cu.AddFunction(function_sp);
     }
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
index 971cbe47fb702d..80f7becc1b24ba 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
@@ -42,7 +42,7 @@ class DWARFASTParser {
 
   virtual Function *ParseFunctionFromDWARF(CompileUnit &comp_unit,
                                            const DWARFDIE &die,
-                                           const AddressRange &range) = 0;
+                                           AddressRanges ranges) = 0;
 
   virtual bool CompleteTypeFromDWARF(const DWARFDIE &die, Type *type,
                                      const CompilerType &compiler_type) = 0;
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index a30d898a93cc4d..66b9013de34179 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2343,9 +2343,8 @@ DWARFASTParserClang::ConstructDemangledNameFromDWARF(const DWARFDIE &die) {
 Function *
 DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
                                             const DWARFDIE &die,
-                                            const AddressRange &func_range) {
-  assert(func_range.GetBaseAddress().IsValid());
-  DWARFRangeList func_ranges;
+                                            AddressRanges func_ranges) {
+  DWARFRangeList unused_func_ranges;
   const char *name = nullptr;
   const char *mangled = nullptr;
   std::optional<int> decl_file;
@@ -2361,9 +2360,9 @@ DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
   if (tag != DW_TAG_subprogram)
     return nullptr;
 
-  if (die.GetDIENamesAndRanges(name, mangled, func_ranges, decl_file, decl_line,
-                               decl_column, call_file, call_line, call_column,
-                               &frame_base)) {
+  if (die.GetDIENamesAndRanges(name, mangled, unused_func_ranges, decl_file,
+                               decl_line, decl_column, call_file, call_line,
+                               call_column, &frame_base)) {
     Mangled func_name;
     if (mangled)
       func_name.SetValue(ConstString(mangled));
@@ -2395,11 +2394,10 @@ DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
     assert(func_type == nullptr || func_type != DIE_IS_BEING_PARSED);
 
     const user_id_t func_user_id = die.GetID();
-    func_sp =
-        std::make_shared<Function>(&comp_unit,
-                                   func_user_id, // UserID is the DIE offset
-                                   func_user_id, func_name, func_type,
-                                   func_range); // first address range
+    func_sp = std::make_shared<Function>(
+        &comp_unit,
+        func_user_id, // UserID is the DIE offset
+        func_user_id, func_name, func_type, std::move(func_ranges));
 
     if (func_sp.get() != nullptr) {
       if (frame_base.IsValid())
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
index 1ffb09b2fc1942..6c3aac6d452e49 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -59,7 +59,7 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {
   lldb_private::Function *
   ParseFunctionFromDWARF(lldb_private::CompileUnit &comp_unit,
                          const lldb_private::plugin::dwarf::DWARFDIE &die,
-                         const lldb_private::AddressRange &func_range) override;
+                         lldb_private::AddressRanges func_ranges) override;
 
   bool CompleteTypeFromDWARF(
       const lldb_private::plugin::dwarf::DWARFDIE &die,
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index e19c490bf44d3f..30553567fe1fea 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -918,32 +918,21 @@ Function *SymbolFileDWARF::ParseFunction(CompileUnit &comp_unit,
   if (!dwarf_ast)
     return nullptr;
 
-  DWARFRangeList ranges = die.GetDIE()->GetAttributeAddressRanges(
-      die.GetCU(), /*check_hi_lo_pc=*/true);
-  if (ranges.IsEmpty())
-    return nullptr;
-
-  // Union of all ranges in the function DIE (if the function is
-  // discontiguous)
-  lldb::addr_t lowest_func_addr = ranges.GetMinRangeBase(0);
-  lldb::addr_t highest_func_addr = ranges.GetMaxRangeEnd(0);
-  if (lowest_func_addr == LLDB_INVALID_ADDRESS ||
-      lowest_func_addr >= highest_func_addr ||
-      lowest_func_addr < m_first_code_address)
-    return nullptr;
 
+  AddressRanges ranges;
   ModuleSP module_sp(die.GetModule());
-  AddressRange func_range;
-  func_range.GetBaseAddress().ResolveAddressUsingFileSections(
-      lowest_func_addr, module_sp->GetSectionList());
-  if (!func_range.GetBaseAddress().IsValid())
-    return nullptr;
-
-  func_range.SetByteSize(highest_func_addr - lowest_func_addr);
-  if (!FixupAddress(func_range.GetBaseAddress()))
+  for (const auto &range : die.GetDIE()->GetAttributeAddressRanges(
+           die.GetCU(), /*check_hi_lo_pc=*/true)) {
+    if (range.base < m_first_code_address)
+      continue;
+    if (Address base_addr(range.base, module_sp->GetSectionList());
+        base_addr.IsValid() && FixupAddress(base_addr))
+      ranges.emplace_back(std::move(base_addr), range.size);
+  }
+  if (ranges.empty())
     return nullptr;
 
-  return dwarf_ast->ParseFunctionFromDWARF(comp_unit, die, func_range);
+  return dwarf_ast->ParseFunctionFromDWARF(comp_unit, die, std::move(ranges));
 }
 
 ConstString
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index c784c2e28f6452..c0416b4d06815d 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -504,7 +504,7 @@ lldb::FunctionSP SymbolFileNativePDB::CreateFunction(PdbCompilandSymId func_id,
   Mangled mangled(proc.Name);
   FunctionSP func_sp = std::make_shared<Function>(
       &comp_unit, toOpaqueUid(func_id), toOpaqueUid(sig_id), mangled,
-      func_type.get(), func_range);
+      func_type.get(), AddressRanges{func_range});
 
   comp_unit.AddFunction(func_sp);
 
diff --git a/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp b/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
index 4fc48b4d133382..c7b23aedfdccac 100644
--- a/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
@@ -310,9 +310,9 @@ SymbolFilePDB::ParseCompileUnitFunctionForPDBFunc(const PDBSymbolFunc &pdb_func,
 
   Mangled mangled = GetMangledForPDBFunc(pdb_func);
 
-  FunctionSP func_sp =
-      std::make_shared<Function>(&comp_unit, pdb_func.getSymIndexId(),
-                                 func_type_uid, mangled, func_type, func_range);
+  FunctionSP func_sp = std::make_shared<Function>(
+      &comp_unit, pdb_func.getSymIndexId(), func_type_uid, mangled, func_type,
+      AddressRanges{func_range});
 
   comp_unit.AddFunction(func_sp);
 
diff --git a/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp b/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
index 8c17017442b1f8..e1e11d274d6e40 100644
--- a/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
+++ b/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
@@ -186,7 +186,7 @@ size_t SymbolFileSymtab::ParseFunctions(CompileUnit &comp_unit) {
                                                // for this function
                              curr_symbol->GetMangled(), // Linker/mangled name
                              nullptr, // no return type for a code symbol...
-                             func_range)); // first address range
+                             AddressRanges{func_range}));
 
             if (func_sp.get() != nullptr) {
               comp_unit.AddFunction(func_sp);
diff --git a/lldb/source/Symbol/Function.cpp b/lldb/source/Symbol/Function.cpp
index 96d8322b43d843..fd5346514022b7 100644
--- a/lldb/source/Symbol/Function.cpp
+++ b/lldb/source/Symbol/Function.cpp
@@ -254,12 +254,32 @@ Function *IndirectCallEdge::GetCallee(ModuleList &images,
 
 /// @}
 
+AddressRange CollapseRanges(llvm::ArrayRef<AddressRange> ranges) {
+  if (ranges.empty())
+    return AddressRange();
+  if (ranges.size() == 1)
+    return ranges[0];
+
+  Address lowest_addr = ranges[0].GetBaseAddress();
+  addr_t highest_addr = lowest_addr.GetFileAddress() + ranges[0].GetByteSize();
+  for (const AddressRange &range: ranges.drop_front()) {
+    Address range_begin = range.GetBaseAddress();
+    addr_t range_end = range_begin.GetFileAddress() + range.GetByteSize();
+    if (range_begin.GetFileAddress() < lowest_addr.GetFileAddress())
+      lowest_addr = range_begin;
+    if (range_end > highest_addr)
+      highest_addr = range_end;
+  }
+  return AddressRange(lowest_addr, highest_addr - lowest_addr.GetFileAddress());
+}
+
 //
 Function::Function(CompileUnit *comp_unit, lldb::user_id_t func_uid,
                    lldb::user_id_t type_uid, const Mangled &mangled, Type *type,
-                   const AddressRange &range)
+                   AddressRanges ranges)
     : UserID(func_uid), m_comp_unit(comp_unit), m_type_uid(type_uid),
-      m_type(type), m_mangled(mangled), m_block(func_uid), m_range(range),
+      m_type(type), m_mangled(mangled), m_block(func_uid),
+      m_ranges(std::move(ranges)), m_range(CollapseRanges(m_ranges)),
       m_frame_base(), m_flags(), m_prologue_byte_size(0) {
   m_block.SetParentScope(this);
   assert(comp_unit != nullptr);
@@ -406,14 +426,13 @@ void Function::GetDescription(Stream *s, lldb::DescriptionLevel level,
     llvm::interleaveComma(decl_context, *s, [&](auto &ctx) { ctx.Dump(*s); });
     *s << "}";
   }
-  *s << ", range = ";
-  Address::DumpStyle fallback_style;
-  if (level == eDescriptionLevelVerbose)
-    fallback_style = Address::DumpStyleModuleWithFileAddress;
-  else
-    fallback_style = Address::DumpStyleFileAddress;
-  GetAddressRange().Dump(s, target, Address::DumpStyleLoadAddress,
-                         fallback_style);
+  *s << ", range" << (m_ranges.size() > 1 ? "s" : "") << " = ";
+  Address::DumpStyle fallback_style =
+      level == eDescriptionLevelVerbose
+          ? Address::DumpStyleModuleWithFileAddress
+          : Address::DumpStyleFileAddress;
+  for (const AddressRange &range : m_ranges)
+    range.Dump(s, target, Address::DumpStyleLoadAddress, fallback_style);
 }
 
 void Function::Dump(Stream *s, bool show_context) const {

Copy link

github-actions bot commented Nov 11, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@JDevlieghere JDevlieghere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +264 to +273
addr_t highest_addr = lowest_addr.GetFileAddress() + ranges[0].GetByteSize();
for (const AddressRange &range : ranges.drop_front()) {
Address range_begin = range.GetBaseAddress();
addr_t range_end = range_begin.GetFileAddress() + range.GetByteSize();
if (range_begin.GetFileAddress() < lowest_addr.GetFileAddress())
lowest_addr = range_begin;
if (range_end > highest_addr)
highest_addr = range_end;
}
return AddressRange(lowest_addr, highest_addr - lowest_addr.GetFileAddress());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't help but think you don't actually need the highest_addr and you could just keep extending the byte size of the range, but maybe I'm missing something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that the range may need extending at both ends (there's no guarantee the lowest address will be the first one). Extending it an the lower end is tricky as you need to both change the base address and increase the range size. Storing it decomposed and only constructing the range object at the end is simpler.

@labath labath merged commit 2a3c08f into llvm:main Nov 12, 2024
7 checks passed
@labath labath deleted the range branch November 12, 2024 08:35
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…#115730)

This is the beginning of a different, more fundamental approach to
handling. This PR tries to tries to minimize functional changes. It only
makes sure that we store the true set of ranges inside the function
object, so that subsequent patches can make use of it.
labath added a commit to labath/llvm-project that referenced this pull request Nov 28, 2024
This is a follow-up/reimplementation of llvm#115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.

This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
labath added a commit that referenced this pull request Dec 3, 2024
This is a follow-up/reimplementation of #115730. While working on that
patch, I did not realize that the correct (discontinuous) set of ranges
is already stored in the block representing the whole function. The
catch -- ranges for this block are only set later, when parsing all of
the blocks of the function.

This patch changes that by populating the function block ranges eagerly
-- from within the Function constructor. This also necessitates a
corresponding change in all of the symbol files -- so that they stop
populating the ranges of that block. This allows us to avoid some
unnecessary work (not parsing the function DW_AT_ranges twice) and also
results in some simplification of the parsing code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants