Commit 15de77d

[lldb] (Prepare to) speed up dwarf indexing (#118657)
Indexing a single DWARF unit is a fairly small task, which means the overhead of enqueueing a task for each unit is not negligible (mainly because it introduces a lot of synchronization points for queue management, memory allocation, etc.). This is particularly true if the binary was built with type units, as these are usually very small.

This essentially brings us back to the state before https://reviews.llvm.org/D78337, but the new implementation is built on the llvm ThreadPool, and I've added a small improvement: we now construct one "index set" per thread instead of one per unit, which should lower the memory usage (fewer small allocations) and make the subsequent merge step faster.

On its own this patch doesn't actually change the performance characteristics because we still have one choke point: progress reporting. I'm leaving that for a separate patch, but in my testing, simply removing the progress reporting gives us about a 30-60% speed boost.
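
To illustrate the scheduling idea described above, here is a minimal standalone sketch. It is not the LLDB code: it assumes plain `std::thread` in place of the llvm ThreadPool, and `ForEachIndex` and `SumOfSquares` are hypothetical names invented for this example. Each worker claims the next unprocessed index from a shared atomic counter and accumulates into its own per-worker slot, which stands in for the per-thread "index set".

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not an LLDB or LLVM API): calls fn(worker_id, idx) for
// every idx in [0, num_items) using num_workers threads. Instead of queueing
// one task per item, each worker loops and claims the next index itself.
template <typename Fn>
void ForEachIndex(std::size_t num_items, std::size_t num_workers, Fn fn) {
  assert(num_workers > 0 && "need at least one worker");
  std::atomic<std::size_t> next_idx{0};
  auto worker = [&](std::size_t worker_id) {
    std::size_t idx;
    // fetch_add hands out each index exactly once across all workers.
    while ((idx = next_idx.fetch_add(1, std::memory_order_relaxed)) <
           num_items)
      fn(worker_id, idx);
  };
  std::vector<std::thread> threads;
  threads.reserve(num_workers);
  for (std::size_t i = 0; i < num_workers; ++i)
    threads.emplace_back(worker, i);
  for (std::thread &t : threads)
    t.join();
}

// Sums the squares of `values` using one partial result per worker, then
// merges the partials on the calling thread after all workers have joined.
unsigned long long SumOfSquares(const std::vector<unsigned> &values,
                                std::size_t num_workers) {
  std::vector<unsigned long long> partial(num_workers, 0);
  ForEachIndex(values.size(), num_workers,
               [&](std::size_t worker_id, std::size_t idx) {
                 // Each worker writes only to its own slot, so no lock is
                 // needed here.
                 partial[worker_id] +=
                     static_cast<unsigned long long>(values[idx]) * values[idx];
               });
  unsigned long long total = 0;
  for (unsigned long long p : partial)
    total += p;
  return total;
}
```

With this shape the number of queued tasks equals the number of workers rather than the number of items, and the number of partial results to merge shrinks in the same way.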
1 parent 6caf9f8 commit 15de77d

1 file changed, +42 -32 lines changed

lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp

Lines changed: 42 additions & 32 deletions
@@ -23,6 +23,7 @@
 #include "lldb/Utility/Timer.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/ThreadPool.h"
+#include <atomic>
 #include <optional>
 
 using namespace lldb_private;
@@ -81,44 +82,53 @@ void ManualDWARFIndex::Index() {
   Progress progress("Manually indexing DWARF", module_desc.GetData(),
                     total_progress);
 
-  std::vector<IndexSet> sets(units_to_index.size());
-
-  // Keep memory down by clearing DIEs for any units if indexing
-  // caused us to load the unit's DIEs.
-  std::vector<std::optional<DWARFUnit::ScopedExtractDIEs>> clear_cu_dies(
-      units_to_index.size());
-  auto parser_fn = [&](size_t cu_idx) {
-    IndexUnit(*units_to_index[cu_idx], dwp_dwarf, sets[cu_idx]);
-    progress.Increment();
-  };
-
-  auto extract_fn = [&](size_t cu_idx) {
-    clear_cu_dies[cu_idx] = units_to_index[cu_idx]->ExtractDIEsScoped();
-    progress.Increment();
-  };
-
   // Share one thread pool across operations to avoid the overhead of
   // recreating the threads.
   llvm::ThreadPoolTaskGroup task_group(Debugger::GetThreadPool());
+  const size_t num_threads = Debugger::GetThreadPool().getMaxConcurrency();
+
+  // Run a function for each compile unit in parallel using as many threads as
+  // are available. This is significantly faster than submitting a new task for
+  // each unit.
+  auto for_each_unit = [&](auto &&fn) {
+    std::atomic<size_t> next_cu_idx = 0;
+    auto wrapper = [&fn, &next_cu_idx, &units_to_index,
+                    &progress](size_t worker_id) {
+      size_t cu_idx;
+      while ((cu_idx = next_cu_idx.fetch_add(1, std::memory_order_relaxed)) <
+             units_to_index.size()) {
+        fn(worker_id, cu_idx, units_to_index[cu_idx]);
+        progress.Increment();
+      }
+    };
 
-  // Create a task runner that extracts dies for each DWARF unit in a
-  // separate thread.
-  // First figure out which units didn't have their DIEs already
-  // parsed and remember this. If no DIEs were parsed prior to this index
-  // function call, we are going to want to clear the CU dies after we are
-  // done indexing to make sure we don't pull in all DWARF dies, but we need
-  // to wait until all units have been indexed in case a DIE in one
-  // unit refers to another and the indexes accesses those DIEs.
-  for (size_t i = 0; i < units_to_index.size(); ++i)
-    task_group.async(extract_fn, i);
-  task_group.wait();
+    for (size_t i = 0; i < num_threads; ++i)
+      task_group.async(wrapper, i);
 
-  // Now create a task runner that can index each DWARF unit in a
-  // separate thread so we can index quickly.
-  for (size_t i = 0; i < units_to_index.size(); ++i)
-    task_group.async(parser_fn, i);
-  task_group.wait();
+    task_group.wait();
+  };
 
+  // Extract DIEs for all DWARF units in parallel. Figure out which units
+  // didn't have their DIEs already parsed and remember this. If no DIEs were
+  // parsed prior to this index function call, we are going to want to clear the
+  // CU DIEs after we are done indexing to make sure we don't pull in all DWARF
+  // DIEs, but we need to wait until all units have been indexed in case a DIE
+  // in one unit refers to another and the index accesses those DIEs.
+  std::vector<std::optional<DWARFUnit::ScopedExtractDIEs>> clear_cu_dies(
+      units_to_index.size());
+  for_each_unit([&clear_cu_dies](size_t, size_t idx, DWARFUnit *unit) {
+    clear_cu_dies[idx] = unit->ExtractDIEsScoped();
+  });
+
+  // Now index all DWARF units in parallel.
+  std::vector<IndexSet> sets(num_threads);
+  for_each_unit(
+      [this, dwp_dwarf, &sets](size_t worker_id, size_t, DWARFUnit *unit) {
+        IndexUnit(*unit, dwp_dwarf, sets[worker_id]);
+      });
+
+  // Merge partial indexes into a single index. Process each index in a set in
+  // parallel.
   auto finalize_fn = [this, &sets, &progress](NameToDIE(IndexSet::*index)) {
     NameToDIE &result = m_set.*index;
     for (auto &set : sets)
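
The hunk is cut off just as the merge step begins, so the rest of `finalize_fn` is not shown here. As a rough illustration of the pointer-to-data-member pattern visible in its signature, the following standalone sketch merges one field at a time from several partial results. `NameList`, `PartialIndex`, `MergeField`, and `MergeAll` are hypothetical stand-ins invented for this example; they are not LLDB's `NameToDIE` or `IndexSet`, and the sketch runs sequentially rather than on a thread pool.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the real index types.
using NameList = std::vector<std::string>;
struct PartialIndex {
  NameList functions;
  NameList types;
};

// Appends the member selected by `field` from every partial index to the same
// member of `result`. Because each call touches a different member of
// `result`, calls for different fields are independent of one another.
void MergeField(PartialIndex &result, const std::vector<PartialIndex> &partials,
                NameList PartialIndex::*field) {
  assert(field != nullptr && "field must name a member of PartialIndex");
  NameList &dst = result.*field;
  for (const PartialIndex &partial : partials) {
    const NameList &src = partial.*field;
    dst.insert(dst.end(), src.begin(), src.end());
  }
}

// Merges all fields. In a parallel version, each MergeField call could be
// submitted as its own task; here they simply run one after another.
PartialIndex MergeAll(const std::vector<PartialIndex> &partials) {
  PartialIndex result;
  MergeField(result, partials, &PartialIndex::functions);
  MergeField(result, partials, &PartialIndex::types);
  return result;
}
```

With one partial result per worker thread instead of one per unit, the inner loop in `MergeField` visits far fewer containers, which is the merge-time benefit the commit message anticipates.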
