Commit eee3090

Fix .debug_aranges parsing issues.
When LLVM error handling was introduced to the parsing of .debug_aranges, it caused major issues if any DWARFDebugArangeSet::extract() call returned an error. DWARFDebugInfo::GetCompileUnitAranges() calls DWARFDebugAranges::extract(), which returned an error if _any_ DWARFDebugArangeSet had an error. By that point the function had already default constructed a DWARFDebugAranges object into DWARFDebugInfo::m_cu_aranges_up and partially populated it, and it returned the error before finishing much needed work in the rest of DWARFDebugInfo::GetCompileUnitAranges(). Subsequent callers of this function would see that DWARFDebugInfo::m_cu_aranges_up was valid and get back this partially populated DWARFDebugAranges reference, _and_ it would not be sorted or minimized.

These bugs caused incomplete .debug_aranges parsing: the code skipped manually parsing any compile units for ranges and did not sort the DWARFDebugAranges in m_cu_aranges_up. They also caused breakpoints set by file and line to fail to set correctly when a symbol context for an address could not be resolved, because the incomplete and unsorted DWARFDebugAranges object returned by DWARFDebugInfo::GetCompileUnitAranges() made symbol context lookups by address (the breakpoint address) fail to find any DWARF debug info for that address.

This patch fixes all of the issues that I found:

- DWARFDebugInfo::GetCompileUnitAranges() no longer returns a "llvm::Expected<DWARFDebugAranges &>", but just returns a "const DWARFDebugAranges &". Why? Because this code contained a fallback that would parse all of the valid DWARFDebugArangeSet objects, check which compile units had valid .debug_aranges set entries, and manually build an address ranges table using DWARFUnit::BuildAddressRangeTable(). If we return an error because any DWARFDebugArangeSet has any errors, then none of this code runs. Now we parse all DWARFDebugArangeSet objects that have no errors; if any call to DWARFDebugArangeSet::extract() returns an error, we skip that DWARFDebugArangeSet so that we can use the fallback call to DWARFUnit::BuildAddressRangeTable(). Since DWARFDebugInfo::GetCompileUnitAranges() needs to parse what it can from .debug_aranges and build address ranges tables for any compile units that don't have any .debug_aranges sets, everything now works as expected.
- Fix an issue where a DWARFDebugArangeSet contains multiple terminator entries. The LLVM parser and llvm-dwarfdump properly warn about this because it happens with Linux compilers and linkers, and it was the original cause of the bug I am fixing here. We now correctly warn about this issue if "log enable dwarf info" is enabled, but we continue to parse the DWARFDebugArangeSet correctly so we don't lose data that is contained in the .debug_aranges section.
- DWARFDebugAranges::extract() no longer returns a llvm::Error because we need to be able to parse all of the valid DWARFDebugArangeSet objects. It also correctly skips a DWARFDebugArangeSet that has errors in the middle of the stream: the start offset of each DWARFDebugArangeSet is now computed from the offset that the previous DWARFDebugArangeSet::extract() call derived from the length field in its header. This means that if we do run into real errors while parsing individual DWARFDebugArangeSet objects, we can continue to parse the rest of the validly encoded DWARFDebugArangeSet objects in the .debug_aranges section. This will allow LLDB to parse DWARF that contains a possibly newer .debug_aranges set format than LLDB currently supports, because we will error out for the parsing of that DWARFDebugArangeSet but still be able to skip to the next DWARFDebugArangeSet object using the "DWARFDebugArangeSet.m_header.length" field to calculate the next starting offset.

Tests were added to cover all new functionality.

Differential Revision: https://reviews.llvm.org/D99401
1 parent b75018e commit eee3090

8 files changed: +245 -67 lines changed

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugArangeSet.cpp

Lines changed: 43 additions & 13 deletions
@@ -8,22 +8,18 @@
 
 #include "DWARFDebugArangeSet.h"
 #include "DWARFDataExtractor.h"
+#include "LogChannelDWARF.h"
 #include "llvm/Object/Error.h"
 #include <cassert>
 
 using namespace lldb_private;
 
 DWARFDebugArangeSet::DWARFDebugArangeSet()
-    : m_offset(DW_INVALID_OFFSET), m_header(), m_arange_descriptors() {
-  m_header.length = 0;
-  m_header.version = 0;
-  m_header.cu_offset = 0;
-  m_header.addr_size = 0;
-  m_header.seg_size = 0;
-}
+    : m_offset(DW_INVALID_OFFSET), m_next_offset(DW_INVALID_OFFSET) {}
 
 void DWARFDebugArangeSet::Clear() {
   m_offset = DW_INVALID_OFFSET;
+  m_next_offset = DW_INVALID_OFFSET;
   m_header.length = 0;
   m_header.version = 0;
   m_header.cu_offset = 0;
@@ -54,6 +50,12 @@ llvm::Error DWARFDebugArangeSet::extract(const DWARFDataExtractor &data,
   // consists of an address and a length, each in the size appropriate for an
   // address on the target architecture.
   m_header.length = data.GetDWARFInitialLength(offset_ptr);
+  // The length could be 4 bytes or 12 bytes, so use the current offset to
+  // determine the next offset correctly.
+  if (m_header.length > 0)
+    m_next_offset = *offset_ptr + m_header.length;
+  else
+    m_next_offset = DW_INVALID_OFFSET;
   m_header.version = data.GetU16(offset_ptr);
   m_header.cu_offset = data.GetDWARFOffset(offset_ptr);
   m_header.addr_size = data.GetU8(offset_ptr);
@@ -105,17 +107,45 @@ llvm::Error DWARFDebugArangeSet::extract(const DWARFDataExtractor &data,
                 "DWARFDebugArangeSet::Descriptor.address and "
                 "DWARFDebugArangeSet::Descriptor.length must have same size");
 
-  while (data.ValidOffset(*offset_ptr)) {
+  const lldb::offset_t next_offset = GetNextOffset();
+  assert(next_offset != DW_INVALID_OFFSET);
+  uint32_t num_terminators = 0;
+  bool last_was_terminator = false;
+  while (*offset_ptr < next_offset) {
     arangeDescriptor.address = data.GetMaxU64(offset_ptr, m_header.addr_size);
     arangeDescriptor.length = data.GetMaxU64(offset_ptr, m_header.addr_size);
 
     // Each set of tuples is terminated by a 0 for the address and 0 for
-    // the length.
-    if (!arangeDescriptor.address && !arangeDescriptor.length)
-      return llvm::ErrorSuccess();
-
-    m_arange_descriptors.push_back(arangeDescriptor);
+    // the length. Some linkers can emit .debug_aranges with multiple
+    // terminator pair entries that are still within the length of the
+    // DWARFDebugArangeSet. We want to be sure to parse all entries for
+    // this DWARFDebugArangeSet so that we don't stop parsing early and end up
+    // treating addresses as a header of the next DWARFDebugArangeSet. We also
+    // need to make sure we parse all valid address pairs so we don't omit them
+    // from the aranges result, so we can't stop at the first terminator entry
+    // we find.
+    if (arangeDescriptor.address == 0 && arangeDescriptor.length == 0) {
+      ++num_terminators;
+      last_was_terminator = true;
+    } else {
+      last_was_terminator = false;
+      // Only add .debug_aranges address entries that have a non zero size.
+      // Some linkers will zero out the length field for some .debug_aranges
+      // entries if they were stripped. We also could watch out for multiple
+      // entries at address zero and remove those as well.
+      if (arangeDescriptor.length > 0)
+        m_arange_descriptors.push_back(arangeDescriptor);
+    }
+  }
+  if (num_terminators > 1) {
+    Log *log = LogChannelDWARF::GetLogIfAll(DWARF_LOG_DEBUG_INFO);
+    LLDB_LOG(log,
+             "warning: DWARFDebugArangeSet at %#" PRIx64 " contains %u "
+             "terminator entries",
+             m_offset, num_terminators);
   }
+  if (last_was_terminator)
+    return llvm::ErrorSuccess();
 
   return llvm::make_error<llvm::object::GenericBinaryError>(
       "arange descriptors not terminated by null entry");

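The terminator handling in the hunk above can be illustrated outside of LLDB. The following is a minimal sketch, assuming the (address, length) pairs of one set have already been decoded into a vector; `Descriptor`, `TupleParseResult`, and `ParseTuples` are hypothetical names for illustration and are not part of the patch. It mirrors the loop's rules: count every (0, 0) pair, keep only entries with a non-zero length, and treat the set as well formed only if its last pair is a terminator.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for DWARFDebugArangeSet::Descriptor.
struct Descriptor {
  uint64_t address;
  uint64_t length;
};

// Hypothetical result type; the real code stores these in members and locals.
struct TupleParseResult {
  std::vector<Descriptor> descriptors; // entries with a non-zero length
  uint32_t num_terminators = 0;        // number of (0, 0) pairs seen
  bool ends_with_terminator = false;   // true if the last pair was (0, 0)
};

// Walk every pair that lies inside the set instead of stopping at the first
// terminator, so entries that follow an early terminator are not lost.
TupleParseResult ParseTuples(const std::vector<Descriptor> &tuples) {
  TupleParseResult result;
  for (const Descriptor &tuple : tuples) {
    if (tuple.address == 0 && tuple.length == 0) {
      ++result.num_terminators;
      result.ends_with_terminator = true;
    } else {
      result.ends_with_terminator = false;
      // Zero-length entries (for example stripped ranges) are dropped.
      if (tuple.length > 0)
        result.descriptors.push_back(tuple);
    }
  }
  return result;
}
```

A caller would warn when `num_terminators` is greater than one and report an error when `ends_with_terminator` is false, which is how the patched loop behaves.
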
lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugArangeSet.h

Lines changed: 18 additions & 14 deletions
@@ -16,18 +16,21 @@
 class DWARFDebugArangeSet {
 public:
   struct Header {
-    uint32_t length;    // The total length of the entries for that set, not
-                        // including the length field itself.
-    uint16_t version;   // The DWARF version number
-    uint32_t cu_offset; // The offset from the beginning of the .debug_info
-                        // section of the compilation unit entry referenced by
-                        // the table.
-    uint8_t addr_size;  // The size in bytes of an address on the target
-                        // architecture. For segmented addressing, this is the
-                        // size of the offset portion of the address
-    uint8_t seg_size; // The size in bytes of a segment descriptor on the target
-                      // architecture. If the target system uses a flat address
-                      // space, this value is 0.
+    /// The total length of the entries for that set, not including the length
+    /// field itself.
+    uint32_t length = 0;
+    /// The DWARF version number.
+    uint16_t version = 0;
+    /// The offset from the beginning of the .debug_info section of the
+    /// compilation unit entry referenced by the table.
+    uint32_t cu_offset = 0;
+    /// The size in bytes of an address on the target architecture. For
+    /// segmented addressing, this is the size of the offset portion of the
+    /// address.
+    uint8_t addr_size = 0;
+    /// The size in bytes of a segment descriptor on the target architecture.
+    /// If the target system uses a flat address space, this value is 0.
+    uint8_t seg_size = 0;
   };
 
   struct Descriptor {
@@ -44,7 +47,7 @@ class DWARFDebugArangeSet {
   dw_offset_t FindAddress(dw_addr_t address) const;
   size_t NumDescriptors() const { return m_arange_descriptors.size(); }
   const Header &GetHeader() const { return m_header; }
-
+  dw_offset_t GetNextOffset() const { return m_next_offset; }
   const Descriptor &GetDescriptorRef(uint32_t i) const {
     return m_arange_descriptors[i];
   }
@@ -54,7 +57,8 @@ class DWARFDebugArangeSet {
   typedef DescriptorColl::iterator DescriptorIter;
   typedef DescriptorColl::const_iterator DescriptorConstIter;
 
-  uint32_t m_offset;
+  dw_offset_t m_offset;
+  dw_offset_t m_next_offset;
   Header m_header;
   DescriptorColl m_arange_descriptors;
 };

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugAranges.cpp

Lines changed: 25 additions & 15 deletions
@@ -9,6 +9,7 @@
 #include "DWARFDebugAranges.h"
 #include "DWARFDebugArangeSet.h"
 #include "DWARFUnit.h"
+#include "LogChannelDWARF.h"
 #include "lldb/Utility/Log.h"
 #include "lldb/Utility/Timer.h"
 
@@ -31,31 +32,40 @@ class CountArangeDescriptors {
 };
 
 // Extract
-llvm::Error
-DWARFDebugAranges::extract(const DWARFDataExtractor &debug_aranges_data) {
+void DWARFDebugAranges::extract(const DWARFDataExtractor &debug_aranges_data) {
   lldb::offset_t offset = 0;
 
   DWARFDebugArangeSet set;
   Range range;
   while (debug_aranges_data.ValidOffset(offset)) {
-    llvm::Error error = set.extract(debug_aranges_data, &offset);
-    if (error)
-      return error;
+    const lldb::offset_t set_offset = offset;
+    if (llvm::Error error = set.extract(debug_aranges_data, &offset)) {
+      Log *log = LogChannelDWARF::GetLogIfAll(DWARF_LOG_DEBUG_INFO);
+      LLDB_LOG_ERROR(log, std::move(error),
+                     "DWARFDebugAranges::extract failed to extract "
+                     ".debug_aranges set at offset %#" PRIx64,
+                     set_offset);
+    } else {
+      const uint32_t num_descriptors = set.NumDescriptors();
+      if (num_descriptors > 0) {
+        const dw_offset_t cu_offset = set.GetHeader().cu_offset;
 
-    const uint32_t num_descriptors = set.NumDescriptors();
-    if (num_descriptors > 0) {
-      const dw_offset_t cu_offset = set.GetHeader().cu_offset;
-
-      for (uint32_t i = 0; i < num_descriptors; ++i) {
-        const DWARFDebugArangeSet::Descriptor &descriptor =
-            set.GetDescriptorRef(i);
-        m_aranges.Append(RangeToDIE::Entry(descriptor.address,
-                                           descriptor.length, cu_offset));
+        for (uint32_t i = 0; i < num_descriptors; ++i) {
+          const DWARFDebugArangeSet::Descriptor &descriptor =
+              set.GetDescriptorRef(i);
+          m_aranges.Append(RangeToDIE::Entry(descriptor.address,
+                                             descriptor.length, cu_offset));
+        }
       }
     }
+    // Always use the previous DWARFDebugArangeSet's information to calculate
+    // the offset of the next DWARFDebugArangeSet in case we encounter an
+    // error in the current DWARFDebugArangeSet and our offset position is
+    // still in the middle of the data. If we do this, we can parse all valid
+    // DWARFDebugArangeSet objects without returning invalid errors.
+    offset = set.GetNextOffset();
     set.Clear();
   }
-  return llvm::ErrorSuccess();
 }
 
 void DWARFDebugAranges::Dump(Log *log) const {
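
The recovery strategy in the loop above (log the bad set, then jump to the next one using the length stored in the header) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the LLDB implementation: it handles only the 32-bit DWARF case where the initial length is a 4-byte little-endian value, it treats "version is 2" as the only validity check, and `ReadLE` and `FindParsableSets` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian unsigned integer of `size` bytes at `offset`.
// The caller guarantees that the bytes are in range.
static uint64_t ReadLE(const std::vector<uint8_t> &data, size_t offset,
                       size_t size) {
  uint64_t value = 0;
  for (size_t i = 0; i < size; ++i)
    value |= static_cast<uint64_t>(data[offset + i]) << (8 * i);
  return value;
}

// Return the start offsets of all sets that "parse" successfully. A set that
// fails is skipped, and parsing resumes at the offset implied by its header.
std::vector<size_t> FindParsableSets(const std::vector<uint8_t> &data) {
  std::vector<size_t> good_sets;
  size_t offset = 0;
  while (offset + 4 <= data.size()) {
    const size_t set_offset = offset;
    const uint64_t length = ReadLE(data, offset, 4);
    const size_t body_offset = offset + 4;
    // A zero or out-of-range length gives no way to locate the next set.
    if (length == 0 || length > data.size() - body_offset)
      break;
    const size_t next_offset = body_offset + static_cast<size_t>(length);
    // Assumption for this sketch: only version 2 sets are understood.
    if (length >= 2 && ReadLE(data, body_offset, 2) == 2)
      good_sets.push_back(set_offset);
    // Always advance with the header length, even after a failed set.
    offset = next_offset;
  }
  return good_sets;
}
```

With three sets at offsets 0, 8, and 18, where the middle one carries an unsupported version, the function reports offsets 0 and 18, so one bad set no longer hides the sets that follow it.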

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugAranges.h

Lines changed: 1 addition & 2 deletions
@@ -26,8 +26,7 @@ class DWARFDebugAranges {
 
   void Clear() { m_aranges.Clear(); }
 
-  llvm::Error
-  extract(const lldb_private::DWARFDataExtractor &debug_aranges_data);
+  void extract(const lldb_private::DWARFDataExtractor &debug_aranges_data);
 
   // Use append range multiple times and then call sort
   void AppendRange(dw_offset_t cu_offset, dw_addr_t low_pc, dw_addr_t high_pc);

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp

Lines changed: 6 additions & 6 deletions
@@ -34,26 +34,26 @@ DWARFDebugInfo::DWARFDebugInfo(SymbolFileDWARF &dwarf,
                                lldb_private::DWARFContext &context)
     : m_dwarf(dwarf), m_context(context), m_units(), m_cu_aranges_up() {}
 
-llvm::Expected<DWARFDebugAranges &> DWARFDebugInfo::GetCompileUnitAranges() {
+const DWARFDebugAranges &DWARFDebugInfo::GetCompileUnitAranges() {
   if (m_cu_aranges_up)
     return *m_cu_aranges_up;
 
   m_cu_aranges_up = std::make_unique<DWARFDebugAranges>();
   const DWARFDataExtractor &debug_aranges_data =
       m_context.getOrLoadArangesData();
-  if (llvm::Error error = m_cu_aranges_up->extract(debug_aranges_data))
-    return std::move(error);
 
-  // Make a list of all CUs represented by the arange data in the file.
+  // Extract what we can from the .debug_aranges first.
+  m_cu_aranges_up->extract(debug_aranges_data);
+
+  // Make a list of all CUs represented by the .debug_aranges data.
   std::set<dw_offset_t> cus_with_data;
   for (size_t n = 0; n < m_cu_aranges_up->GetNumRanges(); n++) {
     dw_offset_t offset = m_cu_aranges_up->OffsetAtIndex(n);
     if (offset != DW_INVALID_OFFSET)
       cus_with_data.insert(offset);
   }
 
-  // Manually build arange data for everything that wasn't in the
-  // .debug_aranges table.
+  // Manually build arange data for everything that wasn't in .debug_aranges.
   const size_t num_units = GetNumUnits();
   for (size_t idx = 0; idx < num_units; ++idx) {
     DWARFUnit *cu = GetUnitAtIndex(idx);
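
The fallback described in the commit message amounts to a set difference: every compile unit that contributed no entries to .debug_aranges gets its ranges built manually. The sketch below shows only that selection step; `UnitsNeedingManualRanges` is a hypothetical helper, the 32-bit `dw_offset_t` alias is an assumption made to keep the example self-contained, and the real code calls DWARFUnit::BuildAddressRangeTable() on each selected unit rather than returning a list.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Assumption: a 32-bit offset type, declared here only for the sketch.
using dw_offset_t = uint32_t;

// Return the offsets of the compile units that have no .debug_aranges data
// and therefore need their address ranges built from the debug info itself.
std::vector<dw_offset_t>
UnitsNeedingManualRanges(const std::vector<dw_offset_t> &all_cu_offsets,
                         const std::set<dw_offset_t> &cus_with_data) {
  std::vector<dw_offset_t> result;
  for (dw_offset_t cu_offset : all_cu_offsets) {
    if (cus_with_data.count(cu_offset) == 0)
      result.push_back(cu_offset);
  }
  return result;
}
```

Because a set that fails to parse contributes nothing to `cus_with_data`, its compile unit automatically lands in the manual list, which is why skipping bad sets instead of returning an error keeps the lookup table complete.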

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h

Lines changed: 1 addition & 1 deletion
@@ -54,7 +54,7 @@ class DWARFDebugInfo {
         (1 << 2) // Show all parent DIEs when dumping single DIEs
   };
 
-  llvm::Expected<DWARFDebugAranges &> GetCompileUnitAranges();
+  const DWARFDebugAranges &GetCompileUnitAranges();
 
 protected:
   typedef std::vector<DWARFUnitSP> UnitColl;

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp

Lines changed: 2 additions & 11 deletions
@@ -1862,17 +1862,8 @@ uint32_t SymbolFileDWARF::ResolveSymbolContext(const Address &so_addr,
     lldb::addr_t file_vm_addr = so_addr.GetFileAddress();
 
     DWARFDebugInfo &debug_info = DebugInfo();
-    llvm::Expected<DWARFDebugAranges &> aranges =
-        debug_info.GetCompileUnitAranges();
-    if (!aranges) {
-      Log *log = LogChannelDWARF::GetLogIfAll(DWARF_LOG_DEBUG_INFO);
-      LLDB_LOG_ERROR(log, aranges.takeError(),
-                     "SymbolFileDWARF::ResolveSymbolContext failed to get cu "
-                     "aranges. {0}");
-      return 0;
-    }
-
-    const dw_offset_t cu_offset = aranges->FindAddress(file_vm_addr);
+    const DWARFDebugAranges &aranges = debug_info.GetCompileUnitAranges();
+    const dw_offset_t cu_offset = aranges.FindAddress(file_vm_addr);
     if (cu_offset == DW_INVALID_OFFSET) {
       // Global variables are not in the compile unit address ranges. The only
       // way to currently find global variables is to iterate over the
