
[mlir] DistinctAttributeAllocator: fix race #132935


Closed · wants to merge 1 commit

Conversation

cota
Contributor

@cota cota commented Mar 25, 2025

The bitfields added in #128566 (threadingIsEnabled : 1 and useThreadLocalAllocator : 1) are accessed without synchronization. Example tsan report:

WARNING: ThreadSanitizer: data race (pid=337)
  Write of size 1 at 0x7260001d0ff0 by thread T224:
    #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29
    #1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40
    #2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8
[...]
Previous write of size 1 at 0x7260001d0ff0 by thread T222:
    #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29
    #1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40
    #2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8

Fix the race by serializing accesses to these fields with the existing allocatorMutex.

@llvmbot llvmbot added the mlir:core (MLIR Core Infrastructure) and mlir labels Mar 25, 2025
@llvmbot
Member

llvmbot commented Mar 25, 2025

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-core

Author: Emilio Cota (cota)


Full diff: https://github.com/llvm/llvm-project/pull/132935.diff

1 File Affected:

  • (modified) mlir/lib/IR/AttributeDetail.h (+12-6)
diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 8fed18140433c..08ab3c0114265 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -413,16 +413,17 @@ class DistinctAttributeAllocator {
   /// Allocates a distinct attribute storage using a thread local bump pointer
   /// allocator to enable synchronization free parallel allocations.
   DistinctAttrStorage *allocate(Attribute referencedAttr) {
+    std::unique_lock<std::mutex> lock(allocatorMutex);
     if (!useThreadLocalAllocator && threadingIsEnabled) {
-      std::scoped_lock<std::mutex> lock(allocatorMutex);
-      return allocateImpl(referencedAttr);
+      return allocateImpl(referencedAttr, lock);
     }
-    return allocateImpl(referencedAttr);
+    return allocateImpl(referencedAttr, lock);
   }
 
   /// Sets a flag that stores if multithreading is enabled. The flag is used to
   /// decide if locking is needed when using a non thread-safe allocator.
   void disableMultiThreading(bool disable = true) {
+    std::scoped_lock<std::mutex> lock(allocatorMutex);
     threadingIsEnabled = !disable;
   }
 
@@ -431,12 +432,15 @@ class DistinctAttributeAllocator {
   /// beyond the lifetime of a child thread calling this function while ensuring
   /// thread-safe allocation.
   void disableThreadLocalStorage(bool disable = true) {
+    std::scoped_lock<std::mutex> lock(allocatorMutex);
     useThreadLocalAllocator = !disable;
   }
 
 private:
-  DistinctAttrStorage *allocateImpl(Attribute referencedAttr) {
-    return new (getAllocatorInUse().Allocate<DistinctAttrStorage>())
+  DistinctAttrStorage *allocateImpl(Attribute referencedAttr,
+                                    const std::unique_lock<std::mutex>& lock) {
+    assert(lock.owns_lock());
+    return new (getAllocatorInUse(lock).Allocate<DistinctAttrStorage>())
         DistinctAttrStorage(referencedAttr);
   }
 
@@ -444,7 +448,9 @@ class DistinctAttributeAllocator {
   /// thread-local, non-thread safe bump pointer allocator is used instead to
   /// prevent use-after-free errors whenever attribute storage created on a
   /// crash recover thread is accessed after the thread joins.
-  llvm::BumpPtrAllocator &getAllocatorInUse() {
+  llvm::BumpPtrAllocator &getAllocatorInUse(
+      const std::unique_lock<std::mutex>& lock) {
+    assert(lock.owns_lock());
     if (useThreadLocalAllocator)
       return allocatorCache.get();
     return allocator;


github-actions bot commented Mar 25, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@cota cota requested review from gysit and River707 March 25, 2025 13:51
@cota
Contributor Author

cota commented Mar 25, 2025

@abulavin I cannot add you as a reviewer but please take a look. Thanks.

github-actions bot

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff b24694371c62a71aab3550e983a6bf971ed721ff 7fbb72b897f5d3402cc22dd3c6edf56b46bf3402 --extensions h -- mlir/lib/IR/AttributeDetail.h
View the diff from clang-format here.
diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 08ab3c0114..6a481aad91 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -438,7 +438,7 @@ public:
 
 private:
   DistinctAttrStorage *allocateImpl(Attribute referencedAttr,
-                                    const std::unique_lock<std::mutex>& lock) {
+                                    const std::unique_lock<std::mutex> &lock) {
     assert(lock.owns_lock());
     return new (getAllocatorInUse(lock).Allocate<DistinctAttrStorage>())
         DistinctAttrStorage(referencedAttr);
@@ -448,8 +448,8 @@ private:
   /// thread-local, non-thread safe bump pointer allocator is used instead to
   /// prevent use-after-free errors whenever attribute storage created on a
   /// crash recover thread is accessed after the thread joins.
-  llvm::BumpPtrAllocator &getAllocatorInUse(
-      const std::unique_lock<std::mutex>& lock) {
+  llvm::BumpPtrAllocator &
+  getAllocatorInUse(const std::unique_lock<std::mutex> &lock) {
     assert(lock.owns_lock());
     if (useThreadLocalAllocator)
       return allocatorCache.get();

Contributor

@gysit gysit left a comment


LGTM

I am fine if this fixes the immediate problem. However, as you wrote on the other PR, the solution defeats the purpose of avoiding lock contention.

When reviewing I was under the impression that we can check these flags safely since they are only set in single-threaded code (for enabling/disabling multi-threading this is probably the case). But it seems like disableThreadLocalStorage is called from multi-threaded code? Would introducing a lock in disableThreadLocalStorage avoid the problem as well?

@cota
Contributor Author

cota commented Mar 25, 2025

Would introducing a lock in disableThreadLocalStorage avoid the problem as well?

That lock would protect useThreadLocalAllocator and we'd have to acquire the lock on all the other accesses to it. It's easier to reuse the existing lock.

When reviewing I was under the impression that we can check these flags safely since they are only set in single-threaded code

DistinctAttributeAllocator is a member of MLIRContext and therefore can be accessed by different threads when threading is enabled. We should either (1) make the bitfields in DistinctAttributeAllocator immutable members, so they are read-only and require no atomics, or (2) properly handle multi-threaded accesses if we don't want the bitfields to be immutable. I think @abulavin should make the call since they have all the context.
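
For illustration only, here is a minimal sketch of what option (2) could look like if the two bitfields were turned into atomics; the class name and the needsLock() helper are hypothetical, and the allocator logic is omitted:

```cpp
#include <atomic>

// Hypothetical sketch: replace the racy bitfields with std::atomic<bool> so
// that concurrent reads and writes are well-defined without allocatorMutex.
class DistinctAttributeAllocatorFlags {
public:
  void disableMultiThreading(bool disable = true) {
    threadingIsEnabled.store(!disable, std::memory_order_release);
  }
  void disableThreadLocalStorage(bool disable = true) {
    useThreadLocalAllocator.store(!disable, std::memory_order_release);
  }
  // Mirrors the check in allocate(): lock only when the shared allocator is
  // used while multi-threading is enabled.
  bool needsLock() const {
    return !useThreadLocalAllocator.load(std::memory_order_acquire) &&
           threadingIsEnabled.load(std::memory_order_acquire);
  }

private:
  std::atomic<bool> threadingIsEnabled{true};
  std::atomic<bool> useThreadLocalAllocator{true};
};
```

This would silence the tsan report without the extra locking, but, as the discussion below shows, it does not by itself address the interleaving of concurrent enable/disable calls.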

@gysit
Contributor

gysit commented Mar 25, 2025

But it seems like disableThreadLocalStorage is called from multi-threaded code? Would introducing a lock in disableThreadLocalStorage avoid the problem as well?

Note to self: locking probably does not help, since PassManager::runWithCrashRecovery executes in parallel and we want thread-local storage to stay disabled as long as at least one pass manager with crash recovery is running.

I guess that, given this additional complication, it may make sense to fall back to a bump pointer allocator & locking.
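
For reference, a rough sketch of that fallback, assuming a single shared allocator guarded by a mutex (the class and method names here are illustrative, not the PR's code):

```cpp
#include "llvm/Support/Allocator.h"
#include <mutex>
#include <utility>

// Illustrative fallback: one shared BumpPtrAllocator with every allocation
// serialized through a mutex. Simple and correct, at the cost of contention.
class SharedDistinctAttrAllocator {
public:
  template <typename StorageT, typename... Args>
  StorageT *allocate(Args &&...args) {
    std::scoped_lock<std::mutex> guard(allocatorMutex);
    return new (allocator.Allocate<StorageT>())
        StorageT(std::forward<Args>(args)...);
  }

private:
  std::mutex allocatorMutex;
  llvm::BumpPtrAllocator allocator;
};
```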

@joker-eph
Collaborator

Note to self: locking probably does not help, since PassManager::runWithCrashRecovery executes in parallel and we want thread-local storage to stay disabled as long as at least one pass manager with crash recovery is running.

We can't disable multi-threading while a pass manager is running, I believe.

@gysit
Contributor

gysit commented Mar 25, 2025

We can't disable multi-threading while a pass manager is running, I believe.

Right, maybe an enable-crash-recovery flag that is set together with the multi-threading flag could be used to decide which allocator to use.
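
For illustration, a hypothetical sketch of that idea; the names are assumptions, and both flags would only be toggled while no other threads are using the context:

```cpp
// Hypothetical: both flags are mutated only in single-threaded phases, so
// plain bools suffice and later reads need no synchronization.
class AllocatorSelection {
public:
  void setMultiThreading(bool enabled) { threadingIsEnabled = enabled; }
  void setCrashRecovery(bool enabled) { crashRecoveryEnabled = enabled; }

  // Avoid the thread-local allocator whenever crash recovery is active, so
  // storage is not destroyed when a crash-recovery thread joins.
  bool useThreadLocalAllocator() const { return !crashRecoveryEnabled; }

  // Lock only when the shared allocator is used with multi-threading on.
  bool needsLock() const { return crashRecoveryEnabled && threadingIsEnabled; }

private:
  bool threadingIsEnabled = true;
  bool crashRecoveryEnabled = false;
};
```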

@joker-eph
Collaborator

I suspect there is a more fundamental problem here.
If PassManager::runWithCrashRecovery() is executed by multiple threads, you can get the following sequence:

ctx->disableThreadLocalStorage(); // T1
ctx->disableThreadLocalStorage(); // T2
ctx->enableThreadLocalStorage(); // T1

At this point T2 can execute while thread-local storage is enabled again.
If we need to lock, I suspect the scope has to cover the entire duration of PassManager::runWithCrashRecovery(); or we could use a semaphore mechanism (increment an atomic integer instead of using a boolean).

Assuming I am correct with this, I would instead revert the original PR.
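
A minimal sketch of the counting variant mentioned above, assuming a hypothetical guard class rather than MLIR's actual API:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical: each runWithCrashRecovery() caller disables thread-local
// storage on entry and re-enables it on exit; it stays disabled until the
// last concurrent caller has finished.
class ThreadLocalStorageGate {
public:
  void disableThreadLocalStorage() { ++disableCount; }
  void enableThreadLocalStorage() {
    assert(disableCount.load() > 0 && "unbalanced enable");
    --disableCount;
  }
  // Thread-local allocation is safe only when no crash-recovery run is live.
  bool useThreadLocalAllocator() const { return disableCount.load() == 0; }

private:
  std::atomic<int> disableCount{0};
};
```

With such a counter, the T1/T2 interleaving above leaves thread-local storage disabled until both threads have called enableThreadLocalStorage().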

@gysit
Contributor

gysit commented Mar 25, 2025

At this point T2 can execute while thread-local storage is enabled again.
If we need to lock, I suspect the scope has to cover the entire duration of PassManager::runWithCrashRecovery(); or we could use a semaphore mechanism (increment an atomic integer instead of using a boolean).

I believe the example is indeed correct and a more complex fix is needed :(. In that case reverting may indeed be the better option. The downside is that it reintroduces the bug we had before.

@cota
Contributor Author

cota commented Mar 25, 2025

I also prefer the revert. Will revert #128566 now. Thanks everyone.

cota added a commit that referenced this pull request Mar 25, 2025
…e when crash reproduction is enabled" (#133000)

Reverts #128566. See as well the discussion in
#132935.
@cota cota closed this Mar 25, 2025
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 25, 2025
…bute storage when crash reproduction is enabled" (#133000)

Reverts llvm/llvm-project#128566. See as well the discussion in
llvm/llvm-project#132935.