Skip to content

[Support] Restrict ManagedStatic ThreadPoolExecutor to Windows #102989

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented Aug 13, 2024

https://reviews.llvm.org/D70447 switched to ManagedStatic to work
around race conditions in MSVC runtimes and the MinGW runtime.
However, ManagedStatic is not suitable for other platforms.

However, this workaround is not suitable for other platforms (#66974).
lld::fatal calls exitLld(1), which calls llvm_shutdown to destroy
ManagedStatic objects. The worker threads will finish and invoke TLS
destructors (glibc __call_tls_dtors). This can lead to race conditions
if other threads attempt to access TLS objects that have already been
destroyed.

While lld's early exit mechanism needs more work, I believe Parallel.cpp
should avoid this pitfall as well.

Created using spr 1.3.5-bogner
@llvmbot
Copy link
Member

llvmbot commented Aug 13, 2024

@llvm/pr-subscribers-llvm-support

Author: Fangrui Song (MaskRay)

Changes

https://reviews.llvm.org/D70447 switched to ManagedStatic to work
around race conditions in MSVC runtimes and the MinGW runtime.
However, ManagedStatic is not suitable for other platforms.

However, this workaround is not suitable for other platforms (#66974).
lld::fatal calls exitLld(1), which calls llvm_shutdown to destroy
ManagedStatic objects. The worker threads will finish and invoke TLS
destructors (glibc __call_tls_dtors). This can lead to race conditions
if other threads attempt to access TLS objects that have already been
destroyed.

While lld's early exit mechanism needs more work, I believe Parallel.cpp
should avoid this pitfall as well.


Full diff: https://github.com/llvm/llvm-project/pull/102989.diff

1 Files Affected:

  • (modified) llvm/lib/Support/Parallel.cpp (+9)
diff --git a/llvm/lib/Support/Parallel.cpp b/llvm/lib/Support/Parallel.cpp
index af35947192c0db..849041f0e21505 100644
--- a/llvm/lib/Support/Parallel.cpp
+++ b/llvm/lib/Support/Parallel.cpp
@@ -156,6 +156,7 @@ class ThreadPoolExecutor : public Executor {
 };
 
 Executor *Executor::getDefaultExecutor() {
+#ifdef _WIN32
   // The ManagedStatic enables the ThreadPoolExecutor to be stopped via
   // llvm_shutdown() which allows a "clean" fast exit, e.g. via _exit(). This
   // stops the thread pool and waits for any worker thread creation to complete
@@ -179,6 +180,14 @@ Executor *Executor::getDefaultExecutor() {
       ManagedExec;
   static std::unique_ptr<ThreadPoolExecutor> Exec(&(*ManagedExec));
   return Exec.get();
+#else
+  // ManagedStatic is not desired on other platforms. When `Exec` is terminated
+  // by llvm_shutdown(), worker threads will clean up and invoke TLS
+  // destructors. This can lead to race conditions if other threads attempt to
+  // access TLS objects that have already been destroyed.
+  static ThreadPoolExecutor Exec(strategy);
+  return &Exec;
+#endif
 }
 } // namespace
 } // namespace detail

Copy link
Member

@mstorsjo mstorsjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Created using spr 1.3.5-bogner
@MaskRay MaskRay merged commit 9487cf9 into main Aug 14, 2024
5 of 7 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/support-restrict-managedstatic-threadpoolexecutor-to-windows branch August 14, 2024 00:02
@nga888
Copy link
Collaborator

nga888 commented Aug 20, 2024

I've been on PTO, but this LGTM. Only comment is that the mention of other threads accessing TLS objects reads a bit odd. IIUC, isn't the issue related to the fact that other threads access objects allocated by a different thread local allocator?

@MaskRay
Copy link
Member Author

MaskRay commented Aug 21, 2024

I've been on PTO, but this LGTM. Only comment is that the mention of other threads accessing TLS objects reads a bit odd. IIUC, isn't the issue related to the fact that other threads access objects allocated by a different thread local allocator?

Yes, the issue was due to threads access objects allocated by a different thread local allocator. The objects might be destroyed TLS destructors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants