[Parallel] Revert sequential task changes #109084

Closed
56 changes: 35 additions & 21 deletions lld/ELF/Relocations.cpp
```diff
@@ -1647,30 +1647,44 @@ template <class ELFT> void elf::scanRelocations() {
   bool serial = !config->zCombreloc || config->emachine == EM_MIPS ||
                 config->emachine == EM_PPC64;
   parallel::TaskGroup tg;
-  for (ELFFileBase *f : ctx.objectFiles) {
-    auto fn = [f]() {
+  auto outerFn = [&]() {
+    for (ELFFileBase *f : ctx.objectFiles) {
+      auto fn = [f]() {
+        RelocationScanner scanner;
+        for (InputSectionBase *s : f->getSections()) {
+          if (s && s->kind() == SectionBase::Regular && s->isLive() &&
+              (s->flags & SHF_ALLOC) &&
+              !(s->type == SHT_ARM_EXIDX && config->emachine == EM_ARM))
+            scanner.template scanSection<ELFT>(*s);
+        }
+      };
+      if (serial)
+        fn();
+      else
+        tg.spawn(fn);
+    }
+    auto scanEH = [] {
       RelocationScanner scanner;
-      for (InputSectionBase *s : f->getSections()) {
-        if (s && s->kind() == SectionBase::Regular && s->isLive() &&
-            (s->flags & SHF_ALLOC) &&
-            !(s->type == SHT_ARM_EXIDX && config->emachine == EM_ARM))
-          scanner.template scanSection<ELFT>(*s);
+      for (Partition &part : ctx.partitions) {
+        for (EhInputSection *sec : part.ehFrame->sections)
+          scanner.template scanSection<ELFT>(*sec, /*isEH=*/true);
+        if (part.armExidx && part.armExidx->isLive())
+          for (InputSection *sec : part.armExidx->exidxSections)
+            if (sec->isLive())
+              scanner.template scanSection<ELFT>(*sec);
       }
     };
-    tg.spawn(fn, serial);
-  }
-
-  tg.spawn([] {
-    RelocationScanner scanner;
-    for (Partition &part : ctx.partitions) {
-      for (EhInputSection *sec : part.ehFrame->sections)
-        scanner.template scanSection<ELFT>(*sec, /*isEH=*/true);
-      if (part.armExidx && part.armExidx->isLive())
-        for (InputSection *sec : part.armExidx->exidxSections)
-          if (sec->isLive())
-            scanner.template scanSection<ELFT>(*sec);
-    }
-  });
+    if (serial)
+      scanEH();
+    else
+      tg.spawn(scanEH);
```
**Collaborator:** Do we need to run `scanEH();` on the main thread here? If it uses `threadIndex`, it will receive `UINT_MAX`, which is unwanted. On the other hand, `scanEH` contains the whole loop, so all `scanSection` calls would be done in deterministic order anyway. It looks like we could replace the whole `if (serial)` with a single `tg.spawn(scanEH);`?

**MaskRay (Member, Author):** We need to run `scanEH` in a worker thread to avoid the `UINT_MAX` `threadIndex`. The `if (serial) scanEH(); else tg.spawn(scanEH);` pattern is analogous to `if (serial) fn(); else tg.spawn(fn);`. `scanEH` scans `.eh_frame` in all input files; `.eh_frame` sections carry relatively few relocations compared with other sections, which is why the previous code scanned all `.eh_frame` in one task.
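The `threadIndex` argument above can be reproduced with a small standalone sketch. `threadIndex` and `runInWorker` here are hypothetical stand-ins, not LLVM's API, assuming the same convention: worker threads initialize a thread-local index on startup, while the main thread keeps the `UINT_MAX` sentinel — which is why even the serial path has to go through `tg.spawn`.

```cpp
#include <cassert>
#include <climits>
#include <functional>
#include <thread>

// Hypothetical mirror of llvm::parallel::threadIndex: worker threads set it
// when they start; the main thread keeps the UINT_MAX sentinel forever.
static thread_local unsigned threadIndex = UINT_MAX;

// Run `fn` on a freshly spawned worker, mirroring `tg.spawn(outerFn)` in the
// serial case: even purely sequential work runs on a thread whose index is valid.
inline void runInWorker(std::function<void()> fn) {
  std::thread worker([fn = std::move(fn)] {
    threadIndex = 0; // the worker initializes its thread-local index
    fn();
  });
  worker.join();
}

// Returns true iff the task observed a valid (non-sentinel) index.
inline bool taskSawValidIndex() {
  bool valid = false;
  runInWorker([&valid] { valid = (threadIndex != UINT_MAX); });
  return valid;
}
```

Running the closure directly on the main thread would instead observe the sentinel, which is the situation the patch avoids.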

```diff
+  };
+  // If `serial` is true, call `spawn` to ensure that `scanner` runs in a thread
+  // with valid getThreadIndex().
+  if (serial)
+    tg.spawn(outerFn);
+  else
+    outerFn();
 }
 
 static bool handleNonPreemptibleIfunc(Symbol &sym, uint16_t flags) {
```
4 changes: 1 addition & 3 deletions llvm/include/llvm/Support/Parallel.h
```diff
@@ -97,9 +97,7 @@ class TaskGroup {
   // Spawn a task, but does not wait for it to finish.
   // Tasks marked with \p Sequential will be executed
   // exactly in the order which they were spawned.
-  // Note: Sequential tasks may be executed on different
-  // threads, but strictly in sequential order.
-  void spawn(std::function<void()> f, bool Sequential = false);
+  void spawn(std::function<void()> f);
 
   void sync() const { L.sync(); }
```
49 changes: 12 additions & 37 deletions llvm/lib/Support/Parallel.cpp
```diff
@@ -12,7 +12,6 @@
 #include "llvm/Support/Threading.h"
 
 #include <atomic>
-#include <deque>
 #include <future>
 #include <thread>
 #include <vector>
@@ -39,7 +38,7 @@ namespace {
 class Executor {
 public:
   virtual ~Executor() = default;
-  virtual void add(std::function<void()> func, bool Sequential = false) = 0;
+  virtual void add(std::function<void()> func) = 0;
   virtual size_t getThreadCount() const = 0;
 
   static Executor *getDefaultExecutor();
@@ -98,56 +97,34 @@ class ThreadPoolExecutor : public Executor {
   static void call(void *Ptr) { ((ThreadPoolExecutor *)Ptr)->stop(); }
 };
 
-  void add(std::function<void()> F, bool Sequential = false) override {
+  void add(std::function<void()> F) override {
     {
       std::lock_guard<std::mutex> Lock(Mutex);
-      if (Sequential)
-        WorkQueueSequential.emplace_front(std::move(F));
-      else
-        WorkQueue.emplace_back(std::move(F));
+      WorkStack.push_back(std::move(F));
     }
     Cond.notify_one();
   }
 
   size_t getThreadCount() const override { return ThreadCount; }
 
 private:
-  bool hasSequentialTasks() const {
-    return !WorkQueueSequential.empty() && !SequentialQueueIsLocked;
-  }
-
-  bool hasGeneralTasks() const { return !WorkQueue.empty(); }
-
   void work(ThreadPoolStrategy S, unsigned ThreadID) {
     threadIndex = ThreadID;
     S.apply_thread_strategy(ThreadID);
     while (true) {
       std::unique_lock<std::mutex> Lock(Mutex);
-      Cond.wait(Lock, [&] {
-        return Stop || hasGeneralTasks() || hasSequentialTasks();
-      });
+      Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
       if (Stop)
         break;
-      bool Sequential = hasSequentialTasks();
-      if (Sequential)
-        SequentialQueueIsLocked = true;
-      else
-        assert(hasGeneralTasks());
-
-      auto &Queue = Sequential ? WorkQueueSequential : WorkQueue;
-      auto Task = std::move(Queue.back());
-      Queue.pop_back();
+      auto Task = std::move(WorkStack.back());
```
**Collaborator:** This was missed in the original code, but it seems that a check for an empty `WorkStack` should be added here:

```cpp
if (WorkStack.empty()) {
  Lock.unlock();
  continue;
}
```

Otherwise `WorkStack.back()` might be called on an empty `WorkStack` after a spurious wakeup?

**MaskRay (Member, Author), Sep 20, 2024:** I think the code is fine. The predicate overload of `condition_variable::wait` re-checks the condition on every wakeup, so a spurious wakeup cannot get past it. Also, `unique_lock::~unique_lock` releases ownership if owned.
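The reply above can be checked against a reduced model of the pool. The sketch below is an assumed simplification of `ThreadPoolExecutor` (`MiniPool` and `runTasks` are illustrative names, not LLVM's) using the same predicate-`wait` discipline: `WorkStack.back()` is only reached while the stack is provably non-empty under the lock, so spurious wakeups are harmless. Unlike the real executor, this version drains remaining tasks before honoring `Stop`.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Reduced single-worker model of the WorkStack consumer loop.
struct MiniPool {
  std::vector<std::function<void()>> WorkStack;
  std::mutex Mutex;
  std::condition_variable Cond;
  bool Stop = false;

  void add(std::function<void()> F) {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      WorkStack.push_back(std::move(F));
    }
    Cond.notify_one();
  }

  void work() {
    while (true) {
      std::unique_lock<std::mutex> Lock(Mutex);
      // The predicate is re-evaluated after every wakeup, spurious or not,
      // so control only proceeds with the lock held and the predicate true.
      Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
      if (Stop && WorkStack.empty())
        break;
      auto Task = std::move(WorkStack.back()); // non-empty is guaranteed here
      WorkStack.pop_back();
      Lock.unlock();
      Task();
    }
  }

  void shutdown() {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      Stop = true;
    }
    Cond.notify_all();
  }
};

// Push `n` tasks through the pool and report how many actually ran.
inline int runTasks(int n) {
  MiniPool pool;
  std::atomic<int> done{0};
  std::thread worker([&] { pool.work(); });
  for (int i = 0; i < n; ++i)
    pool.add([&] { ++done; });
  while (done.load() != n)
    std::this_thread::yield();
  pool.shutdown();
  worker.join();
  return done.load();
}
```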

```diff
+      WorkStack.pop_back();
       Lock.unlock();
       Task();
-      if (Sequential)
-        SequentialQueueIsLocked = false;
     }
   }
 
   std::atomic<bool> Stop{false};
-  std::atomic<bool> SequentialQueueIsLocked{false};
-  std::deque<std::function<void()>> WorkQueue;
-  std::deque<std::function<void()>> WorkQueueSequential;
+  std::vector<std::function<void()>> WorkStack;
   std::mutex Mutex;
   std::condition_variable Cond;
   std::promise<void> ThreadsCreated;
@@ -214,16 +191,14 @@ TaskGroup::~TaskGroup() {
   L.sync();
 }
 
-void TaskGroup::spawn(std::function<void()> F, bool Sequential) {
+void TaskGroup::spawn(std::function<void()> F) {
 #if LLVM_ENABLE_THREADS
   if (Parallel) {
     L.inc();
-    detail::Executor::getDefaultExecutor()->add(
-        [&, F = std::move(F)] {
-          F();
-          L.dec();
-        },
-        Sequential);
+    detail::Executor::getDefaultExecutor()->add([&, F = std::move(F)] {
+      F();
+      L.dec();
+    });
     return;
   }
 #endif
```
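The `L.inc()` / `L.dec()` / `L.sync()` calls in `TaskGroup::spawn` implement a countdown-latch discipline: increment before a task is handed to the executor, decrement when it finishes, and have `sync()` block until the count returns to zero. A minimal sketch of that pattern (`MiniLatch` and `spawnAndSync` are illustrative names, not the actual `llvm::parallel::detail::Latch`):

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Countdown latch: sync() blocks until every inc() has a matching dec().
class MiniLatch {
  unsigned Count = 0;
  std::mutex Mutex;
  std::condition_variable Cond;

public:
  void inc() {
    std::lock_guard<std::mutex> G(Mutex);
    ++Count;
  }
  void dec() {
    std::lock_guard<std::mutex> G(Mutex);
    if (--Count == 0)
      Cond.notify_all();
  }
  void sync() {
    std::unique_lock<std::mutex> L(Mutex);
    Cond.wait(L, [&] { return Count == 0; });
  }
};

// Spawn-like helper: inc() before each task starts, dec() when it completes,
// then sync() to wait for all of them. Returns the sum 0 + 1 + ... + (n-1).
inline int spawnAndSync(int n) {
  MiniLatch L;
  std::atomic<int> Sum{0};
  std::vector<std::thread> Threads;
  for (int i = 0; i < n; ++i) {
    L.inc();
    Threads.emplace_back([&, i] {
      Sum += i;
      L.dec();
    });
  }
  L.sync(); // returns only after every task has called dec()
  for (auto &T : Threads)
    T.join();
  return Sum.load();
}
```

This mirrors why `TaskGroup`'s destructor can simply call `L.sync()`: the latch cannot reach zero while any spawned task is still running.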
10 changes: 0 additions & 10 deletions llvm/unittests/Support/ParallelTest.cpp
```diff
@@ -94,16 +94,6 @@ TEST(Parallel, ForEachError) {
   EXPECT_EQ(errText, std::string("asdf\nasdf\nasdf"));
 }
 
-TEST(Parallel, TaskGroupSequentialFor) {
-  size_t Count = 0;
-  {
-    parallel::TaskGroup tg;
-    for (size_t Idx = 0; Idx < 500; Idx++)
-      tg.spawn([&Count, Idx]() { EXPECT_EQ(Count++, Idx); }, true);
-  }
-  EXPECT_EQ(Count, 500ul);
-}
-
 #if LLVM_ENABLE_THREADS
 TEST(Parallel, NestedTaskGroup) {
   // This test checks:
```
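With the `Sequential` flag gone, the ordering guarantee exercised by the deleted `TaskGroupSequentialFor` test has to come from task structure instead — which is exactly what the lld change does with `outerFn`. A hypothetical sketch of the same idea (`makeSequentialTask` and `runOrdered` are not LLVM APIs): ordered steps are folded into one closure, which can then be handed to a single `spawn`.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Fold ordered steps into a single task: within one task, steps run strictly
// in insertion order, regardless of how the task itself is scheduled.
inline std::function<void()>
makeSequentialTask(std::vector<std::function<void()>> steps) {
  return [steps = std::move(steps)] {
    for (const auto &step : steps)
      step();
  };
}

// Mirror of the deleted test's check: each step increments Count only if it
// observes the in-order value, so the result equals n iff order was preserved.
inline std::size_t runOrdered(std::size_t n) {
  std::size_t Count = 0;
  std::vector<std::function<void()>> steps;
  for (std::size_t Idx = 0; Idx < n; ++Idx)
    steps.push_back([&Count, Idx] {
      if (Count == Idx)
        ++Count;
    });
  makeSequentialTask(std::move(steps))(); // run as one sequential task
  return Count;
}
```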