-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb] (Begin to) support discontinuous lldb_private::Functions #115730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it.
@llvm/pr-subscribers-lldb Author: Pavel Labath (labath) ChangesThis is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it. Full diff: https://github.com/llvm/llvm-project/pull/115730.diff 11 Files Affected:
diff --git a/lldb/include/lldb/Symbol/Function.h b/lldb/include/lldb/Symbol/Function.h
index 8255c2baea7700..bbfc25fe74ea06 100644
--- a/lldb/include/lldb/Symbol/Function.h
+++ b/lldb/include/lldb/Symbol/Function.h
@@ -428,7 +428,7 @@ class Function : public UserID, public SymbolContextScope {
/// The section offset based address for this function.
Function(CompileUnit *comp_unit, lldb::user_id_t func_uid,
lldb::user_id_t func_type_uid, const Mangled &mangled,
- Type *func_type, const AddressRange &range);
+ Type *func_type, AddressRanges ranges);
/// Destructor.
~Function() override;
@@ -649,8 +649,12 @@ class Function : public UserID, public SymbolContextScope {
/// All lexical blocks contained in this function.
Block m_block;
+ /// List of address ranges belonging to the function.
+ AddressRanges m_ranges;
+
/// The function address range that covers the widest range needed to contain
- /// all blocks
+ /// all blocks. DEPRECATED: do not use this field in new code as the range may
+ /// include addresses belonging to other functions.
AddressRange m_range;
/// The frame base expression for variables that are relative to the frame
diff --git a/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp b/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
index 9e78ba8174e3d5..fadc19676609bf 100644
--- a/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
+++ b/lldb/source/Plugins/SymbolFile/Breakpad/SymbolFileBreakpad.cpp
@@ -255,7 +255,7 @@ FunctionSP SymbolFileBreakpad::GetOrCreateFunction(CompileUnit &comp_unit) {
section_sp, address - section_sp->GetFileAddress(), record->Size);
// Use the CU's id because every CU has only one function inside.
func_sp = std::make_shared<Function>(&comp_unit, id, 0, func_name,
- nullptr, func_range);
+ nullptr, AddressRanges{func_range});
comp_unit.AddFunction(func_sp);
}
}
diff --git a/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp b/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
index bb738c3dcc54a0..15e8d38e7f334b 100644
--- a/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
+++ b/lldb/source/Plugins/SymbolFile/CTF/SymbolFileCTF.cpp
@@ -829,7 +829,7 @@ size_t SymbolFileCTF::ParseFunctions(CompileUnit &cu) {
lldb::user_id_t func_uid = m_functions.size();
FunctionSP function_sp = std::make_shared<Function>(
&cu, func_uid, function_type_uid, symbol->GetMangled(), type_sp.get(),
- func_range);
+ AddressRanges{func_range});
m_functions.emplace_back(function_sp);
cu.AddFunction(function_sp);
}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
index 971cbe47fb702d..80f7becc1b24ba 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParser.h
@@ -42,7 +42,7 @@ class DWARFASTParser {
virtual Function *ParseFunctionFromDWARF(CompileUnit &comp_unit,
const DWARFDIE &die,
- const AddressRange &range) = 0;
+ AddressRanges ranges) = 0;
virtual bool CompleteTypeFromDWARF(const DWARFDIE &die, Type *type,
const CompilerType &compiler_type) = 0;
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index a30d898a93cc4d..66b9013de34179 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2343,9 +2343,8 @@ DWARFASTParserClang::ConstructDemangledNameFromDWARF(const DWARFDIE &die) {
Function *
DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
const DWARFDIE &die,
- const AddressRange &func_range) {
- assert(func_range.GetBaseAddress().IsValid());
- DWARFRangeList func_ranges;
+ AddressRanges func_ranges) {
+ DWARFRangeList unused_func_ranges;
const char *name = nullptr;
const char *mangled = nullptr;
std::optional<int> decl_file;
@@ -2361,9 +2360,9 @@ DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
if (tag != DW_TAG_subprogram)
return nullptr;
- if (die.GetDIENamesAndRanges(name, mangled, func_ranges, decl_file, decl_line,
- decl_column, call_file, call_line, call_column,
- &frame_base)) {
+ if (die.GetDIENamesAndRanges(name, mangled, unused_func_ranges, decl_file,
+ decl_line, decl_column, call_file, call_line,
+ call_column, &frame_base)) {
Mangled func_name;
if (mangled)
func_name.SetValue(ConstString(mangled));
@@ -2395,11 +2394,10 @@ DWARFASTParserClang::ParseFunctionFromDWARF(CompileUnit &comp_unit,
assert(func_type == nullptr || func_type != DIE_IS_BEING_PARSED);
const user_id_t func_user_id = die.GetID();
- func_sp =
- std::make_shared<Function>(&comp_unit,
- func_user_id, // UserID is the DIE offset
- func_user_id, func_name, func_type,
- func_range); // first address range
+ func_sp = std::make_shared<Function>(
+ &comp_unit,
+ func_user_id, // UserID is the DIE offset
+ func_user_id, func_name, func_type, std::move(func_ranges));
if (func_sp.get() != nullptr) {
if (frame_base.IsValid())
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
index 1ffb09b2fc1942..6c3aac6d452e49 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -59,7 +59,7 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {
lldb_private::Function *
ParseFunctionFromDWARF(lldb_private::CompileUnit &comp_unit,
const lldb_private::plugin::dwarf::DWARFDIE &die,
- const lldb_private::AddressRange &func_range) override;
+ lldb_private::AddressRanges func_ranges) override;
bool CompleteTypeFromDWARF(
const lldb_private::plugin::dwarf::DWARFDIE &die,
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index e19c490bf44d3f..30553567fe1fea 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -918,32 +918,21 @@ Function *SymbolFileDWARF::ParseFunction(CompileUnit &comp_unit,
if (!dwarf_ast)
return nullptr;
- DWARFRangeList ranges = die.GetDIE()->GetAttributeAddressRanges(
- die.GetCU(), /*check_hi_lo_pc=*/true);
- if (ranges.IsEmpty())
- return nullptr;
-
- // Union of all ranges in the function DIE (if the function is
- // discontiguous)
- lldb::addr_t lowest_func_addr = ranges.GetMinRangeBase(0);
- lldb::addr_t highest_func_addr = ranges.GetMaxRangeEnd(0);
- if (lowest_func_addr == LLDB_INVALID_ADDRESS ||
- lowest_func_addr >= highest_func_addr ||
- lowest_func_addr < m_first_code_address)
- return nullptr;
+ AddressRanges ranges;
ModuleSP module_sp(die.GetModule());
- AddressRange func_range;
- func_range.GetBaseAddress().ResolveAddressUsingFileSections(
- lowest_func_addr, module_sp->GetSectionList());
- if (!func_range.GetBaseAddress().IsValid())
- return nullptr;
-
- func_range.SetByteSize(highest_func_addr - lowest_func_addr);
- if (!FixupAddress(func_range.GetBaseAddress()))
+ for (const auto &range : die.GetDIE()->GetAttributeAddressRanges(
+ die.GetCU(), /*check_hi_lo_pc=*/true)) {
+ if (range.base < m_first_code_address)
+ continue;
+ if (Address base_addr(range.base, module_sp->GetSectionList());
+ base_addr.IsValid() && FixupAddress(base_addr))
+ ranges.emplace_back(std::move(base_addr), range.size);
+ }
+ if (ranges.empty())
return nullptr;
- return dwarf_ast->ParseFunctionFromDWARF(comp_unit, die, func_range);
+ return dwarf_ast->ParseFunctionFromDWARF(comp_unit, die, std::move(ranges));
}
ConstString
diff --git a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
index c784c2e28f6452..c0416b4d06815d 100644
--- a/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp
@@ -504,7 +504,7 @@ lldb::FunctionSP SymbolFileNativePDB::CreateFunction(PdbCompilandSymId func_id,
Mangled mangled(proc.Name);
FunctionSP func_sp = std::make_shared<Function>(
&comp_unit, toOpaqueUid(func_id), toOpaqueUid(sig_id), mangled,
- func_type.get(), func_range);
+ func_type.get(), AddressRanges{func_range});
comp_unit.AddFunction(func_sp);
diff --git a/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp b/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
index 4fc48b4d133382..c7b23aedfdccac 100644
--- a/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
+++ b/lldb/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
@@ -310,9 +310,9 @@ SymbolFilePDB::ParseCompileUnitFunctionForPDBFunc(const PDBSymbolFunc &pdb_func,
Mangled mangled = GetMangledForPDBFunc(pdb_func);
- FunctionSP func_sp =
- std::make_shared<Function>(&comp_unit, pdb_func.getSymIndexId(),
- func_type_uid, mangled, func_type, func_range);
+ FunctionSP func_sp = std::make_shared<Function>(
+ &comp_unit, pdb_func.getSymIndexId(), func_type_uid, mangled, func_type,
+ AddressRanges{func_range});
comp_unit.AddFunction(func_sp);
diff --git a/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp b/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
index 8c17017442b1f8..e1e11d274d6e40 100644
--- a/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
+++ b/lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
@@ -186,7 +186,7 @@ size_t SymbolFileSymtab::ParseFunctions(CompileUnit &comp_unit) {
// for this function
curr_symbol->GetMangled(), // Linker/mangled name
nullptr, // no return type for a code symbol...
- func_range)); // first address range
+ AddressRanges{func_range}));
if (func_sp.get() != nullptr) {
comp_unit.AddFunction(func_sp);
diff --git a/lldb/source/Symbol/Function.cpp b/lldb/source/Symbol/Function.cpp
index 96d8322b43d843..fd5346514022b7 100644
--- a/lldb/source/Symbol/Function.cpp
+++ b/lldb/source/Symbol/Function.cpp
@@ -254,12 +254,32 @@ Function *IndirectCallEdge::GetCallee(ModuleList &images,
/// @}
+AddressRange CollapseRanges(llvm::ArrayRef<AddressRange> ranges) {
+ if (ranges.empty())
+ return AddressRange();
+ if (ranges.size() == 1)
+ return ranges[0];
+
+ Address lowest_addr = ranges[0].GetBaseAddress();
+ addr_t highest_addr = lowest_addr.GetFileAddress() + ranges[0].GetByteSize();
+ for (const AddressRange &range: ranges.drop_front()) {
+ Address range_begin = range.GetBaseAddress();
+ addr_t range_end = range_begin.GetFileAddress() + range.GetByteSize();
+ if (range_begin.GetFileAddress() < lowest_addr.GetFileAddress())
+ lowest_addr = range_begin;
+ if (range_end > highest_addr)
+ highest_addr = range_end;
+ }
+ return AddressRange(lowest_addr, highest_addr - lowest_addr.GetFileAddress());
+}
+
//
Function::Function(CompileUnit *comp_unit, lldb::user_id_t func_uid,
lldb::user_id_t type_uid, const Mangled &mangled, Type *type,
- const AddressRange &range)
+ AddressRanges ranges)
: UserID(func_uid), m_comp_unit(comp_unit), m_type_uid(type_uid),
- m_type(type), m_mangled(mangled), m_block(func_uid), m_range(range),
+ m_type(type), m_mangled(mangled), m_block(func_uid),
+ m_ranges(std::move(ranges)), m_range(CollapseRanges(m_ranges)),
m_frame_base(), m_flags(), m_prologue_byte_size(0) {
m_block.SetParentScope(this);
assert(comp_unit != nullptr);
@@ -406,14 +426,13 @@ void Function::GetDescription(Stream *s, lldb::DescriptionLevel level,
llvm::interleaveComma(decl_context, *s, [&](auto &ctx) { ctx.Dump(*s); });
*s << "}";
}
- *s << ", range = ";
- Address::DumpStyle fallback_style;
- if (level == eDescriptionLevelVerbose)
- fallback_style = Address::DumpStyleModuleWithFileAddress;
- else
- fallback_style = Address::DumpStyleFileAddress;
- GetAddressRange().Dump(s, target, Address::DumpStyleLoadAddress,
- fallback_style);
+ *s << ", range" << (m_ranges.size() > 1 ? "s" : "") << " = ";
+ Address::DumpStyle fallback_style =
+ level == eDescriptionLevelVerbose
+ ? Address::DumpStyleModuleWithFileAddress
+ : Address::DumpStyleFileAddress;
+ for (const AddressRange &range : m_ranges)
+ range.Dump(s, target, Address::DumpStyleLoadAddress, fallback_style);
}
void Function::Dump(Stream *s, bool show_context) const {
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
addr_t highest_addr = lowest_addr.GetFileAddress() + ranges[0].GetByteSize(); | ||
for (const AddressRange &range : ranges.drop_front()) { | ||
Address range_begin = range.GetBaseAddress(); | ||
addr_t range_end = range_begin.GetFileAddress() + range.GetByteSize(); | ||
if (range_begin.GetFileAddress() < lowest_addr.GetFileAddress()) | ||
lowest_addr = range_begin; | ||
if (range_end > highest_addr) | ||
highest_addr = range_end; | ||
} | ||
return AddressRange(lowest_addr, highest_addr - lowest_addr.GetFileAddress()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't help but think you don't actually need the highest_addr
and you could just keep extending the byte size of the range, but maybe I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the range may need extending at both ends (there's no guarantee the lowest address will be the first one). Extending it an the lower end is tricky as you need to both change the base address and increase the range size. Storing it decomposed and only constructing the range object at the end is simpler.
…#115730) This is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it.
This is a follow-up/reimplementation of llvm#115730. While working on that patch, I did not realize that the correct (discontinuous) set of ranges is already stored in the block representing the whole function. The catch -- ranges for this block are only set later, when parsing all of the blocks of the function. This patch changes that by populating the function block ranges eagerly -- from within the Function constructor. This also necessitates a corresponding change in all of the symbol files -- so that they stop populating the ranges of that block. This allows us to avoid some unnecessary work (not parsing the function DW_AT_ranges twice) and also results in some simplification of the parsing code.
This is a follow-up/reimplementation of #115730. While working on that patch, I did not realize that the correct (discontinuous) set of ranges is already stored in the block representing the whole function. The catch -- ranges for this block are only set later, when parsing all of the blocks of the function. This patch changes that by populating the function block ranges eagerly -- from within the Function constructor. This also necessitates a corresponding change in all of the symbol files -- so that they stop populating the ranges of that block. This allows us to avoid some unnecessary work (not parsing the function DW_AT_ranges twice) and also results in some simplification of the parsing code.
This is the beginning of a different, more fundamental approach to handling. This PR tries to tries to minimize functional changes. It only makes sure that we store the true set of ranges inside the function object, so that subsequent patches can make use of it.