Skip to content

[lldb] Parse DWARF CFI for discontinuous functions #137006

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 7, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 6 additions & 3 deletions lldb/include/lldb/Symbol/DWARFCallFrameInfo.h
Original file line number Diff line number Diff line change
Expand Up @@ -47,12 +47,15 @@ class DWARFCallFrameInfo {
/// Return an UnwindPlan based on the call frame information encoded in the
/// FDE of this DWARFCallFrameInfo section. The returned plan will be valid
/// (at least) for the given address.
bool GetUnwindPlan(const Address &addr, UnwindPlan &unwind_plan);
std::unique_ptr<UnwindPlan> GetUnwindPlan(const Address &addr);

/// Return an UnwindPlan based on the call frame information encoded in the
/// FDE of this DWARFCallFrameInfo section. The returned plan will be valid
/// (at least) for some address in the given range.
bool GetUnwindPlan(const AddressRange &range, UnwindPlan &unwind_plan);
/// (at least) for some address in the given ranges. If no unwind information
/// is found, nullptr is returned. \a addr represents the entry point of the
/// function. It corresponds to the offset zero in the returned UnwindPlan.
std::unique_ptr<UnwindPlan> GetUnwindPlan(llvm::ArrayRef<AddressRange> ranges,
const Address &addr);

typedef RangeVector<lldb::addr_t, uint32_t> FunctionAddressAndSizeVector;

Expand Down
64 changes: 34 additions & 30 deletions lldb/source/Symbol/DWARFCallFrameInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -151,53 +151,57 @@ DWARFCallFrameInfo::DWARFCallFrameInfo(ObjectFile &objfile,
SectionSP &section_sp, Type type)
: m_objfile(objfile), m_section_sp(section_sp), m_type(type) {}

bool DWARFCallFrameInfo::GetUnwindPlan(const Address &addr,
UnwindPlan &unwind_plan) {
return GetUnwindPlan(AddressRange(addr, 1), unwind_plan);
std::unique_ptr<UnwindPlan>
DWARFCallFrameInfo::GetUnwindPlan(const Address &addr) {
return GetUnwindPlan({AddressRange(addr, 1)}, addr);
}

bool DWARFCallFrameInfo::GetUnwindPlan(const AddressRange &range,
UnwindPlan &unwind_plan) {
std::unique_ptr<UnwindPlan>
DWARFCallFrameInfo::GetUnwindPlan(llvm::ArrayRef<AddressRange> ranges,
const Address &addr) {
FDEEntryMap::Entry fde_entry;
Address addr = range.GetBaseAddress();

// Make sure that the Address we're searching for is the same object file as
// this DWARFCallFrameInfo, we only store File offsets in m_fde_index.
ModuleSP module_sp = addr.GetModule();
if (module_sp.get() == nullptr || module_sp->GetObjectFile() == nullptr ||
module_sp->GetObjectFile() != &m_objfile)
return false;
return nullptr;

std::optional<FDEEntryMap::Entry> entry = GetFirstFDEEntryInRange(range);
if (!entry)
return false;
std::vector<AddressRange> valid_ranges;

std::optional<FDE> fde = ParseFDE(entry->data, addr);
if (!fde)
return false;

unwind_plan.SetSourceName(m_type == EH ? "eh_frame CFI" : "DWARF CFI");
auto result = std::make_unique<UnwindPlan>(GetRegisterKind());
result->SetSourceName(m_type == EH ? "eh_frame CFI" : "DWARF CFI");
// In theory the debug_frame info should be valid at all call sites
// ("asynchronous unwind info" as it is sometimes called) but in practice
// gcc et al all emit call frame info for the prologue and call sites, but
// not for the epilogue or all the other locations during the function
// reliably.
unwind_plan.SetUnwindPlanValidAtAllInstructions(eLazyBoolNo);
unwind_plan.SetSourcedFromCompiler(eLazyBoolYes);
unwind_plan.SetRegisterKind(GetRegisterKind());

unwind_plan.SetPlanValidAddressRanges({fde->range});
unwind_plan.SetUnwindPlanForSignalTrap(fde->for_signal_trap ? eLazyBoolYes
: eLazyBoolNo);
unwind_plan.SetReturnAddressRegister(fde->return_addr_reg_num);
int64_t slide =
fde->range.GetBaseAddress().GetFileAddress() - addr.GetFileAddress();
for (UnwindPlan::Row &row : fde->rows) {
row.SlideOffset(slide);
unwind_plan.AppendRow(std::move(row));
result->SetUnwindPlanValidAtAllInstructions(eLazyBoolNo);
result->SetSourcedFromCompiler(eLazyBoolYes);
result->SetUnwindPlanForSignalTrap(eLazyBoolNo);
for (const AddressRange &range : ranges) {
std::optional<FDEEntryMap::Entry> entry = GetFirstFDEEntryInRange(range);
if (!entry)
continue;
std::optional<FDE> fde = ParseFDE(entry->data, addr);
if (!fde)
continue;
int64_t slide =
fde->range.GetBaseAddress().GetFileAddress() - addr.GetFileAddress();
valid_ranges.push_back(std::move(fde->range));
if (fde->for_signal_trap)
result->SetUnwindPlanForSignalTrap(eLazyBoolYes);
result->SetReturnAddressRegister(fde->return_addr_reg_num);
for (UnwindPlan::Row &row : fde->rows) {
row.SlideOffset(slide);
result->AppendRow(std::move(row));
}
}

return true;
result->SetPlanValidAddressRanges(std::move(valid_ranges));
if (result->GetRowCount() == 0)
return nullptr;
return result;
}

bool DWARFCallFrameInfo::GetAddressRange(Address addr, AddressRange &range) {
Expand Down
21 changes: 7 additions & 14 deletions lldb/source/Symbol/FuncUnwinders.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -149,13 +149,9 @@ FuncUnwinders::GetEHFrameUnwindPlan(Target &target) {
return m_unwind_plan_eh_frame_sp;

m_tried_unwind_plan_eh_frame = true;
if (m_range.GetBaseAddress().IsValid()) {
DWARFCallFrameInfo *eh_frame = m_unwind_table.GetEHFrameInfo();
if (eh_frame) {
auto plan_sp = std::make_shared<UnwindPlan>(lldb::eRegisterKindGeneric);
if (eh_frame->GetUnwindPlan(m_range, *plan_sp))
m_unwind_plan_eh_frame_sp = std::move(plan_sp);
}
if (m_addr.IsValid()) {
if (DWARFCallFrameInfo *eh_frame = m_unwind_table.GetEHFrameInfo())
m_unwind_plan_eh_frame_sp = eh_frame->GetUnwindPlan(m_ranges, m_addr);
}
return m_unwind_plan_eh_frame_sp;
}
Expand All @@ -167,13 +163,10 @@ FuncUnwinders::GetDebugFrameUnwindPlan(Target &target) {
return m_unwind_plan_debug_frame_sp;

m_tried_unwind_plan_debug_frame = true;
if (m_range.GetBaseAddress().IsValid()) {
DWARFCallFrameInfo *debug_frame = m_unwind_table.GetDebugFrameInfo();
if (debug_frame) {
auto plan_sp = std::make_shared<UnwindPlan>(lldb::eRegisterKindGeneric);
if (debug_frame->GetUnwindPlan(m_range, *plan_sp))
m_unwind_plan_debug_frame_sp = std::move(plan_sp);
}
if (!m_ranges.empty()) {
if (DWARFCallFrameInfo *debug_frame = m_unwind_table.GetDebugFrameInfo())
m_unwind_plan_debug_frame_sp =
debug_frame->GetUnwindPlan(m_ranges, m_addr);
}
return m_unwind_plan_debug_frame_sp;
}
Expand Down
27 changes: 15 additions & 12 deletions lldb/source/Symbol/UnwindTable.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,13 @@ AddressRanges UnwindTable::GetAddressRanges(const Address &addr,
return {};
}

static Address GetFunctionOrSymbolAddress(const Address &addr,
const SymbolContext &sc) {
if (Address result = sc.GetFunctionOrSymbolAddress(); result.IsValid())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol at first I was like wait, does that actually work, but sure enough in C++17.

return result;
return addr;
}

FuncUnwindersSP
UnwindTable::GetFuncUnwindersContainingAddress(const Address &addr,
const SymbolContext &sc) {
Expand All @@ -131,25 +138,20 @@ UnwindTable::GetFuncUnwindersContainingAddress(const Address &addr,

// There is an UnwindTable per object file, so we can safely use file handles
addr_t file_addr = addr.GetFileAddress();
iterator end = m_unwinds.end();
iterator insert_pos = end;
if (!m_unwinds.empty()) {
insert_pos = m_unwinds.lower_bound(file_addr);
iterator pos = insert_pos;
if ((pos == m_unwinds.end()) ||
(pos != m_unwinds.begin() &&
pos->second->GetFunctionStartAddress() != addr))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized that checking for the start address is not correct here as we're storing the FuncUnwinders multiple times here -- once per address range.

--pos;

iterator insert_pos = m_unwinds.upper_bound(file_addr);
if (insert_pos != m_unwinds.begin()) {
auto pos = std::prev(insert_pos);
if (pos->second->ContainsAddress(addr))
return pos->second;
}

Address start_addr = GetFunctionOrSymbolAddress(addr, sc);
AddressRanges ranges = GetAddressRanges(addr, sc);
if (ranges.empty())
return nullptr;

auto func_unwinder_sp = std::make_shared<FuncUnwinders>(*this, addr, ranges);
auto func_unwinder_sp =
std::make_shared<FuncUnwinders>(*this, start_addr, ranges);
for (const AddressRange &range : ranges)
m_unwinds.emplace_hint(insert_pos, range.GetBaseAddress().GetFileAddress(),
func_unwinder_sp);
Expand All @@ -164,11 +166,12 @@ FuncUnwindersSP UnwindTable::GetUncachedFuncUnwindersContainingAddress(
const Address &addr, const SymbolContext &sc) {
Initialize();

Address start_addr = GetFunctionOrSymbolAddress(addr, sc);
AddressRanges ranges = GetAddressRanges(addr, sc);
if (ranges.empty())
return nullptr;

return std::make_shared<FuncUnwinders>(*this, addr, std::move(ranges));
return std::make_shared<FuncUnwinders>(*this, start_addr, std::move(ranges));
}

void UnwindTable::Dump(Stream &s) {
Expand Down
20 changes: 9 additions & 11 deletions lldb/source/Target/RegisterContextUnwind.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -868,13 +868,11 @@ RegisterContextUnwind::GetFullUnwindPlanForFrame() {

// Even with -fomit-frame-pointer, we can try eh_frame to get back on
// track.
DWARFCallFrameInfo *eh_frame =
pc_module_sp->GetUnwindTable().GetEHFrameInfo();
if (eh_frame) {
auto unwind_plan_sp =
std::make_shared<UnwindPlan>(lldb::eRegisterKindGeneric);
if (eh_frame->GetUnwindPlan(m_current_pc, *unwind_plan_sp))
return unwind_plan_sp;
if (DWARFCallFrameInfo *eh_frame =
pc_module_sp->GetUnwindTable().GetEHFrameInfo()) {
if (std::unique_ptr<UnwindPlan> plan_up =
eh_frame->GetUnwindPlan(m_current_pc))
return plan_up;
}

ArmUnwindInfo *arm_exidx =
Expand Down Expand Up @@ -1345,9 +1343,9 @@ RegisterContextUnwind::SavedLocationForRegister(
// value instead of the Return Address register.
// If $pc is not available, fall back to the RA reg.
UnwindPlan::Row::AbstractRegisterLocation scratch;
if (m_frame_type == eTrapHandlerFrame &&
active_row->GetRegisterInfo
(pc_regnum.GetAsKind (unwindplan_registerkind), scratch)) {
if (m_frame_type == eTrapHandlerFrame && active_row &&
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a splattering of defensive checks as it's easier to get a null active row now (some places around here are already checking it, but not all of them).

active_row->GetRegisterInfo(
pc_regnum.GetAsKind(unwindplan_registerkind), scratch)) {
UnwindLogMsg("Providing pc register instead of rewriting to "
"RA reg because this is a trap handler and there is "
"a location for the saved pc register value.");
Expand Down Expand Up @@ -1377,7 +1375,7 @@ RegisterContextUnwind::SavedLocationForRegister(
}
}

if (regnum.IsValid() &&
if (regnum.IsValid() && active_row &&
active_row->GetRegisterInfo(regnum.GetAsKind(unwindplan_registerkind),
unwindplan_regloc)) {
have_unwindplan_regloc = true;
Expand Down
54 changes: 30 additions & 24 deletions lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,9 @@
# int bar() { return foo(0); }
# int foo(int flag) { return flag ? bar() : baz(); }
# int main() { return foo(1); }
# The function bar has been placed "in the middle" of foo.
# The function bar has been placed "in the middle" of foo. The functions are not
# using the frame pointer register and the are deliberately adjusting the stack
# pointer to test that we're using the correct unwind row.

.text

Expand All @@ -20,35 +22,36 @@ baz:
.type foo,@function
foo:
.cfi_startproc
pushq %rbp
pushq %rbx
.cfi_def_cfa_offset 16
.cfi_offset %rbp, -16
movq %rsp, %rbp
.cfi_def_cfa_register %rbp
subq $16, %rsp
movl %edi, -8(%rbp)
cmpl $0, -8(%rbp)
.cfi_offset %rbx, -16
movl %edi, %ebx
cmpl $0, %ebx
je foo.__part.2
jmp foo.__part.1
.cfi_endproc
.Lfoo_end:
.size foo, .Lfoo_end-foo

# NB: Deliberately inserting padding to separate the two parts of the function
# as we're currently only parsing a single FDE entry from a (coalesced) address
# range.
nop

foo.__part.1:
.cfi_startproc
.cfi_def_cfa %rbp, 16
.cfi_offset %rbp, -16
.cfi_def_cfa_offset 16
.cfi_offset %rbx, -16
subq $16, %rsp
.cfi_def_cfa_offset 32
callq bar
movl %eax, -4(%rbp)
jmp foo.__part.3
.Lfoo.__part.1_end:
.size foo.__part.1, .Lfoo.__part.1_end-foo.__part.1
.cfi_endproc

bar:
.cfi_startproc
# NB: Decrease the stack pointer to make the unwind info for this function
# different from the surrounding foo function.
subq $24, %rsp
.cfi_def_cfa_offset 32
xorl %edi, %edi
Expand All @@ -62,22 +65,26 @@ bar:

foo.__part.2:
.cfi_startproc
.cfi_def_cfa %rbp, 16
.cfi_offset %rbp, -16
.cfi_def_cfa_offset 16
.cfi_offset %rbx, -16
subq $16, %rsp
.cfi_def_cfa_offset 32
callq baz
movl %eax, -4(%rbp)
jmp foo.__part.3
.Lfoo.__part.2_end:
.size foo.__part.2, .Lfoo.__part.2_end-foo.__part.2
.cfi_endproc

# NB: Deliberately inserting padding to separate the two parts of the function
# as we're currently only parsing a single FDE entry from a (coalesced) address
# range.
nop

foo.__part.3:
.cfi_startproc
.cfi_def_cfa %rbp, 16
.cfi_offset %rbp, -16
movl -4(%rbp), %eax
addq $16, %rsp
popq %rbp
.cfi_def_cfa_offset 32
.cfi_offset %rbx, -16
addq $24, %rsp
.cfi_def_cfa %rsp, 8
retq
.Lfoo.__part.3_end:
Expand Down Expand Up @@ -186,9 +193,8 @@ main:
.byte 86
.asciz "foo" # DW_AT_name
.byte 4 # Abbrev [4] DW_TAG_formal_parameter
.byte 2 # DW_AT_location
.byte 145
.byte 120
.byte 1 # DW_AT_location
.byte 0x53 # DW_OP_reg3
.asciz "flag" # DW_AT_name
.long .Lint-.Lcu_begin0 # DW_AT_type
.byte 0 # End Of Children Mark
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,23 +17,32 @@
image show-unwind --cached true -n foo
# CHECK: UNWIND PLANS for {{.*}}`foo
#
# CHECK: Assembly language inspection UnwindPlan:
# CHECK-NEXT: This UnwindPlan originally sourced from assembly insn profiling
# CHECK-NEXT: This UnwindPlan is sourced from the compiler: no.
# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: yes.
# CHECK: eh_frame UnwindPlan:
# CHECK-NEXT: This UnwindPlan originally sourced from eh_frame CFI
# CHECK-NEXT: This UnwindPlan is sourced from the compiler: yes.
# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: no.
# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
# TODO: This address range isn't correct right now. We're just checking that
# it's a different range from the one in the next query.
# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 6-0x0000000000000046)
# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 6-0x0000000000000010)[{{.*}}.text + 17-0x000000000000001c)[{{.*}}.text + 44-0x0000000000000037)[{{.*}}.text + 56-0x000000000000003d)
# CHECK-NEXT: row[0]: 0: CFA=rsp +8 => rip=[CFA-8]
# CHECK-NEXT: row[1]: 1: CFA=rsp+16 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[2]: 11: CFA=rsp+16 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[3]: 15: CFA=rsp+32 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[4]: 38: CFA=rsp+16 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[5]: 42: CFA=rsp+32 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[6]: 50: CFA=rsp+32 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-NEXT: row[7]: 54: CFA=rsp +8 => rbx=[CFA-16] rip=[CFA-8]
# CHECK-EMPTY:

image show-unwind --cached true -n bar
# CHECK: UNWIND PLANS for {{.*}}`bar

# CHECK: Assembly language inspection UnwindPlan:
# CHECK-NEXT: This UnwindPlan originally sourced from assembly insn profiling
# CHECK-NEXT: This UnwindPlan is sourced from the compiler: no.
# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: yes.
# CHECK: eh_frame UnwindPlan:
# CHECK-NEXT: This UnwindPlan originally sourced from eh_frame CFI
# CHECK-NEXT: This UnwindPlan is sourced from the compiler: yes.
# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: no.
# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
# TODO: This address range isn't correct right now. We're just checking that
# it's a different range from the one in the previous query.
# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 35-0x0000000000000033)
# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 28-0x000000000000002c)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obviously not relevant to this PR but man Address range of this UnwindPlan: [{{.*}}.text + 28-0x000000000000002c is ugly. I wonder if AddressRange::Dump should call DumpAddress(s->AsRawOstream(), m_base_addr.GetOffset() + GetByteSize(), 1); (1 instead of addr_size as it does today) for DumpStyleSectionNameOffset - so it's not address-size zero padded - but I'm sure that would break some other tabular printing elsewhere, so nevermind.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's quite a mess. I've been planning to look into that.

# CHECK-NEXT: row[0]: 0: CFA=rsp +8 => rip=[CFA-8]
# CHECK-NEXT: row[1]: 4: CFA=rsp+32 => rip=[CFA-8]
# CHECK-NEXT: row[2]: 15: CFA=rsp +8 => rip=[CFA-8]
# CHECK-EMPTY:
Loading
Loading