-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[mlir] DistinctAttributeAllocator: fix race #132935
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-core Author: Emilio Cota (cota) ChangesThe bitfields added in #128566 (
Fix the race by serializing accesses to these fields with the existing Full diff: https://github.com/llvm/llvm-project/pull/132935.diff 1 Files Affected:
diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 8fed18140433c..08ab3c0114265 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -413,16 +413,17 @@ class DistinctAttributeAllocator {
/// Allocates a distinct attribute storage using a thread local bump pointer
/// allocator to enable synchronization free parallel allocations.
DistinctAttrStorage *allocate(Attribute referencedAttr) {
+ std::unique_lock<std::mutex> lock(allocatorMutex);
if (!useThreadLocalAllocator && threadingIsEnabled) {
- std::scoped_lock<std::mutex> lock(allocatorMutex);
- return allocateImpl(referencedAttr);
+ return allocateImpl(referencedAttr, lock);
}
- return allocateImpl(referencedAttr);
+ return allocateImpl(referencedAttr, lock);
}
/// Sets a flag that stores if multithreading is enabled. The flag is used to
/// decide if locking is needed when using a non thread-safe allocator.
void disableMultiThreading(bool disable = true) {
+ std::scoped_lock<std::mutex> lock(allocatorMutex);
threadingIsEnabled = !disable;
}
@@ -431,12 +432,15 @@ class DistinctAttributeAllocator {
/// beyond the lifetime of a child thread calling this function while ensuring
/// thread-safe allocation.
void disableThreadLocalStorage(bool disable = true) {
+ std::scoped_lock<std::mutex> lock(allocatorMutex);
useThreadLocalAllocator = !disable;
}
private:
- DistinctAttrStorage *allocateImpl(Attribute referencedAttr) {
- return new (getAllocatorInUse().Allocate<DistinctAttrStorage>())
+ DistinctAttrStorage *allocateImpl(Attribute referencedAttr,
+ const std::unique_lock<std::mutex>& lock) {
+ assert(lock.owns_lock());
+ return new (getAllocatorInUse(lock).Allocate<DistinctAttrStorage>())
DistinctAttrStorage(referencedAttr);
}
@@ -444,7 +448,9 @@ class DistinctAttributeAllocator {
/// thread-local, non-thread safe bump pointer allocator is used instead to
/// prevent use-after-free errors whenever attribute storage created on a
/// crash recover thread is accessed after the thread joins.
- llvm::BumpPtrAllocator &getAllocatorInUse() {
+ llvm::BumpPtrAllocator &getAllocatorInUse(
+ const std::unique_lock<std::mutex>& lock) {
+ assert(lock.owns_lock());
if (useThreadLocalAllocator)
return allocatorCache.get();
return allocator;
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
The bitfields added in llvm#128566 (`threadingIsEnabled : 1` and `useThreadLocalAllocator : 1`) are accessed without synchronization. Example tsan report: ``` WARNING: ThreadSanitizer: data race (pid=337) Write of size 1 at 0x7260001d0ff0 by thread T224: #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29 llvm#1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40 llvm#2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8 [...] Previous write of size 1 at 0x7260001d0ff0 by thread T222: #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29 llvm#1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40 llvm#2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8 ``` Fix the race by serializing accesses to these fields with the existing `allocatorMutex`.
@abulavin I cannot add you as a reviewer but please take a look. Thanks. |
You can test this locally with the following command:git-clang-format --diff b24694371c62a71aab3550e983a6bf971ed721ff 7fbb72b897f5d3402cc22dd3c6edf56b46bf3402 --extensions h -- mlir/lib/IR/AttributeDetail.h View the diff from clang-format here.diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 08ab3c0114..6a481aad91 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -438,7 +438,7 @@ public:
private:
DistinctAttrStorage *allocateImpl(Attribute referencedAttr,
- const std::unique_lock<std::mutex>& lock) {
+ const std::unique_lock<std::mutex> &lock) {
assert(lock.owns_lock());
return new (getAllocatorInUse(lock).Allocate<DistinctAttrStorage>())
DistinctAttrStorage(referencedAttr);
@@ -448,8 +448,8 @@ private:
/// thread-local, non-thread safe bump pointer allocator is used instead to
/// prevent use-after-free errors whenever attribute storage created on a
/// crash recover thread is accessed after the thread joins.
- llvm::BumpPtrAllocator &getAllocatorInUse(
- const std::unique_lock<std::mutex>& lock) {
+ llvm::BumpPtrAllocator &
+ getAllocatorInUse(const std::unique_lock<std::mutex> &lock) {
assert(lock.owns_lock());
if (useThreadLocalAllocator)
return allocatorCache.get();
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I am fine if this fixes the immediate problem. However, as you wrote on the other PR the solution defeats the purpose of avoiding lock contention.
When reviewing I was under the impression that we can check this flags safely since they are only set in single-thread code (for enable/disable multi-threading this is probably the case). But it seems like disableThreadLocalStorage
is called from multi-thread code? Would introducing a lock in disableThreadLocalStorage
avoid the problem as well?
That lock would protect
|
Note to myself. Locking probably does not help since I guess given this additional complication it may make sense to fall back to a bump pointer allocator & locking. |
We can't disable multi-threading while a pass manager is running I believe. |
Right, maybe a enable crash recovery flag that is set together the multi-threading flag could be used to decide which allocator to use. |
I suspect there is a more fundamental problem here.
At this point T2 can execute while the thread local storage it enabled again. Assuming I am correct with this, I would instead revert the original PR. |
I believe the example sounds indeed correct and a more complex fix is needed :(. In that case reverting may indeed be the better option. The downside is that it reintroduces the bug we had before. |
I also prefer the revert. Will revert #128566 now. Thanks everyone. |
…bute storage when crash reproduction is enabled" (#133000) Reverts llvm/llvm-project#128566. See as well the discussion in llvm/llvm-project#132935.
The bitfields added in #128566 (
threadingIsEnabled : 1
anduseThreadLocalAllocator : 1
) are accessed without synchronization. Example tsan report:Fix the race by serializing accesses to these fields with the existing
allocatorMutex
.