[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled #128566

abulavin · 2025-02-24T20:22:14Z

Currently, DistinctAttr uses an allocator wrapped in a ThreadLocalCache to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed.

However, this setup can cause use-after-free errors when mlir::PassManager runs its passes on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis.

Example: This invocation of mlir-opt demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins:

mlir-opt --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir

This pull request changes the distinct attribute allocator to use different allocators depending on whether or not threading is enabled and whether or not the pass manager is running its passes in a separate thread. If multithreading is disabled, a non thread-local allocator is used. If threading remains enabled and the pass manager invokes its pass pipelines in a child thread, then a non-thread local but synchronised allocator is used. This ensures that the lifetime of allocated storage persists beyond the lifetime of the child thread.

I have added two tests for the -test-distinct-attrs pass and the -enable-debug-info-on-llvm-scope passes that run them with crash reproduction enabled.

…abled

github-actions · 2025-02-24T20:22:39Z

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

llvmbot · 2025-02-24T20:23:12Z

@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-mlir-core

Author: Artemiy Bulavin (abulavin)

Changes

Currently, DistinctAttr uses a specific allocator wrapped in a ThreadLocalCache to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed.

However, this setup can cause use-after-free errors when mlir::PassManager runs on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis.

Example: This invocation of mlir-opt demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins:

mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir

This pull request changes the distinct attribute allocator to use just BumpPtrAllocator when threading is disabled in the MLIRContext. The absence of threading caused by --mlir-disable-threading means ThreadLocalCache is unnecessary. This change also avoids segmentation faults when generating local reproducers with crash reproduction.

I have added tests specifically for the -test-distinct-attrs pass and the enable-debug-info-on-llvm-scope passes that cover the distinct attr case speficifcally and also add a test case for a pass that creates these distinct attributes (the latter is how I discovered this issue.)

Full diff: https://github.com/llvm/llvm-project/pull/128566.diff

4 Files Affected:

(modified) mlir/lib/IR/AttributeDetail.h (+20-2)
(modified) mlir/lib/IR/MLIRContext.cpp (+1)
(added) mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir (+19)
(added) mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir (+17)

diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 26d40ac3a38f6..6787dc6d3e698 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -23,6 +23,7 @@
 #include "mlir/Support/ThreadLocalCache.h"
 #include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/PointerIntPair.h"
+#include "llvm/Support/Allocator.h"
 #include "llvm/Support/TrailingObjects.h"
 
 namespace mlir {
@@ -409,14 +410,31 @@ class DistinctAttributeAllocator {
   operator=(const DistinctAttributeAllocator &) = delete;
 
   /// Allocates a distinct attribute storage using a thread local bump pointer
-  /// allocator to enable synchronization free parallel allocations.
+  /// allocator to enable synchronization free parallel allocations. If
+  /// threading is disabled on the owning MLIR context, a normal non
+  /// thread-local bump pointer allocator is used instead to prevent
+  /// use-after-free errors whenever attribute storage created on a crash
+  /// recover thread is accessed after the thread joins.
   DistinctAttrStorage *allocate(Attribute referencedAttr) {
-    return new (allocatorCache.get().Allocate<DistinctAttrStorage>())
+    return new (getAllocatorInUse().Allocate<DistinctAttrStorage>())
         DistinctAttrStorage(referencedAttr);
   }
 
+  void disableMultithreading(bool disable = true) {
+    useThreadLocalCache = !disable;
+  };
+
 private:
+  llvm::BumpPtrAllocator &getAllocatorInUse() {
+    if (useThreadLocalCache) {
+      return allocatorCache.get();
+    }
+    return allocator;
+  }
+
   ThreadLocalCache<llvm::BumpPtrAllocator> allocatorCache;
+  llvm::BumpPtrAllocator allocator;
+  bool useThreadLocalCache : 1;
 };
 } // namespace detail
 } // namespace mlir
diff --git a/mlir/lib/IR/MLIRContext.cpp b/mlir/lib/IR/MLIRContext.cpp
index 87782e84dd6e4..04f2cf8f7b1ec 100644
--- a/mlir/lib/IR/MLIRContext.cpp
+++ b/mlir/lib/IR/MLIRContext.cpp
@@ -596,6 +596,7 @@ void MLIRContext::disableMultithreading(bool disable) {
   // Update the threading mode for each of the uniquers.
   impl->affineUniquer.disableMultithreading(disable);
   impl->attributeUniquer.disableMultithreading(disable);
+  impl->distinctAttributeAllocator.disableMultithreading(disable);
   impl->typeUniquer.disableMultithreading(disable);
 
   // Destroy thread pool (stop all threads) if it is no longer needed, or create
diff --git a/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir b/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir
new file mode 100644
index 0000000000000..5f49df419bccb
--- /dev/null
+++ b/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir
@@ -0,0 +1,19 @@
+// Test that the enable-debug-info-scope-on-llvm-func pass can create its
+// DI attributes when running in the crash reproducer thread,
+
+// RUN: mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. \
+// RUN:          --pass-pipeline="builtin.module(ensure-debug-info-scope-on-llvm-func)" \
+// RUN:          --mlir-print-debuginfo %s | FileCheck %s
+
+module {
+  llvm.func @func_no_debug() {
+    llvm.return loc(unknown)
+  } loc(unknown)
+} loc(unknown)
+
+// CHECK-LABEL: llvm.func @func_no_debug()
+// CHECK: llvm.return loc(#loc
+// CHECK: loc(#loc[[LOC:[0-9]+]])
+// CHECK: #di_file = #llvm.di_file<"<unknown>" in "">
+// CHECK: #di_subprogram = #llvm.di_subprogram<id = distinct[{{.*}}]<>, compileUnit = #di_compile_unit, scope = #di_file, name = "func_no_debug", linkageName = "func_no_debug", file = #di_file, line = 1, scopeLine = 1, subprogramFlags = "Definition|Optimized", type = #di_subroutine_type>
+// CHECK: #loc[[LOC]] = loc(fused<#di_subprogram>
diff --git a/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir b/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir
new file mode 100644
index 0000000000000..5c3b78bc6e154
--- /dev/null
+++ b/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir
@@ -0,0 +1,17 @@
+// This is a regression test that verifies that when running with crash
+// reproduction enabled, distinct attribute storage is not allocated in
+// thread-local storage. Since crash reproduction runs the pass manager in a
+// separate thread, using thread-local storage for distinct attributes causes
+// use-after-free errors once the thread that runs the pass manager joins.
+
+// RUN: mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. %s -test-distinct-attrs | FileCheck %s
+
+// CHECK: #[[DIST0:.*]] = distinct[0]<42 : i32>
+// CHECK: #[[DIST1:.*]] = distinct[1]<42 : i32>
+#distinct = distinct[0]<42 : i32>
+
+// CHECK: @foo_1
+func.func @foo_1() {
+  // CHECK: "test.op"() {distinct.input = #[[DIST0]], distinct.output = #[[DIST1]]}
+  "test.op"() {distinct.input = #distinct} : () -> ()
+}

llvmbot · 2025-02-24T20:23:12Z

@llvm/pr-subscribers-mlir

Author: Artemiy Bulavin (abulavin)

Changes

Currently, DistinctAttr uses a specific allocator wrapped in a ThreadLocalCache to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed.

However, this setup can cause use-after-free errors when mlir::PassManager runs on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis.

Example: This invocation of mlir-opt demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins:

mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir

This pull request changes the distinct attribute allocator to use just BumpPtrAllocator when threading is disabled in the MLIRContext. The absence of threading caused by --mlir-disable-threading means ThreadLocalCache is unnecessary. This change also avoids segmentation faults when generating local reproducers with crash reproduction.

I have added tests specifically for the -test-distinct-attrs pass and the enable-debug-info-on-llvm-scope passes that cover the distinct attr case speficifcally and also add a test case for a pass that creates these distinct attributes (the latter is how I discovered this issue.)

Full diff: https://github.com/llvm/llvm-project/pull/128566.diff

4 Files Affected:

(modified) mlir/lib/IR/AttributeDetail.h (+20-2)
(modified) mlir/lib/IR/MLIRContext.cpp (+1)
(added) mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir (+19)
(added) mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir (+17)

diff --git a/mlir/lib/IR/AttributeDetail.h b/mlir/lib/IR/AttributeDetail.h
index 26d40ac3a38f6..6787dc6d3e698 100644
--- a/mlir/lib/IR/AttributeDetail.h
+++ b/mlir/lib/IR/AttributeDetail.h
@@ -23,6 +23,7 @@
 #include "mlir/Support/ThreadLocalCache.h"
 #include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/PointerIntPair.h"
+#include "llvm/Support/Allocator.h"
 #include "llvm/Support/TrailingObjects.h"
 
 namespace mlir {
@@ -409,14 +410,31 @@ class DistinctAttributeAllocator {
   operator=(const DistinctAttributeAllocator &) = delete;
 
   /// Allocates a distinct attribute storage using a thread local bump pointer
-  /// allocator to enable synchronization free parallel allocations.
+  /// allocator to enable synchronization free parallel allocations. If
+  /// threading is disabled on the owning MLIR context, a normal non
+  /// thread-local bump pointer allocator is used instead to prevent
+  /// use-after-free errors whenever attribute storage created on a crash
+  /// recover thread is accessed after the thread joins.
   DistinctAttrStorage *allocate(Attribute referencedAttr) {
-    return new (allocatorCache.get().Allocate<DistinctAttrStorage>())
+    return new (getAllocatorInUse().Allocate<DistinctAttrStorage>())
         DistinctAttrStorage(referencedAttr);
   }
 
+  void disableMultithreading(bool disable = true) {
+    useThreadLocalCache = !disable;
+  };
+
 private:
+  llvm::BumpPtrAllocator &getAllocatorInUse() {
+    if (useThreadLocalCache) {
+      return allocatorCache.get();
+    }
+    return allocator;
+  }
+
   ThreadLocalCache<llvm::BumpPtrAllocator> allocatorCache;
+  llvm::BumpPtrAllocator allocator;
+  bool useThreadLocalCache : 1;
 };
 } // namespace detail
 } // namespace mlir
diff --git a/mlir/lib/IR/MLIRContext.cpp b/mlir/lib/IR/MLIRContext.cpp
index 87782e84dd6e4..04f2cf8f7b1ec 100644
--- a/mlir/lib/IR/MLIRContext.cpp
+++ b/mlir/lib/IR/MLIRContext.cpp
@@ -596,6 +596,7 @@ void MLIRContext::disableMultithreading(bool disable) {
   // Update the threading mode for each of the uniquers.
   impl->affineUniquer.disableMultithreading(disable);
   impl->attributeUniquer.disableMultithreading(disable);
+  impl->distinctAttributeAllocator.disableMultithreading(disable);
   impl->typeUniquer.disableMultithreading(disable);
 
   // Destroy thread pool (stop all threads) if it is no longer needed, or create
diff --git a/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir b/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir
new file mode 100644
index 0000000000000..5f49df419bccb
--- /dev/null
+++ b/mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir
@@ -0,0 +1,19 @@
+// Test that the enable-debug-info-scope-on-llvm-func pass can create its
+// DI attributes when running in the crash reproducer thread,
+
+// RUN: mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. \
+// RUN:          --pass-pipeline="builtin.module(ensure-debug-info-scope-on-llvm-func)" \
+// RUN:          --mlir-print-debuginfo %s | FileCheck %s
+
+module {
+  llvm.func @func_no_debug() {
+    llvm.return loc(unknown)
+  } loc(unknown)
+} loc(unknown)
+
+// CHECK-LABEL: llvm.func @func_no_debug()
+// CHECK: llvm.return loc(#loc
+// CHECK: loc(#loc[[LOC:[0-9]+]])
+// CHECK: #di_file = #llvm.di_file<"<unknown>" in "">
+// CHECK: #di_subprogram = #llvm.di_subprogram<id = distinct[{{.*}}]<>, compileUnit = #di_compile_unit, scope = #di_file, name = "func_no_debug", linkageName = "func_no_debug", file = #di_file, line = 1, scopeLine = 1, subprogramFlags = "Definition|Optimized", type = #di_subroutine_type>
+// CHECK: #loc[[LOC]] = loc(fused<#di_subprogram>
diff --git a/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir b/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir
new file mode 100644
index 0000000000000..5c3b78bc6e154
--- /dev/null
+++ b/mlir/test/IR/test-builtin-distinct-attrs-with-crash-reproduction.mlir
@@ -0,0 +1,17 @@
+// This is a regression test that verifies that when running with crash
+// reproduction enabled, distinct attribute storage is not allocated in
+// thread-local storage. Since crash reproduction runs the pass manager in a
+// separate thread, using thread-local storage for distinct attributes causes
+// use-after-free errors once the thread that runs the pass manager joins.
+
+// RUN: mlir-opt --mlir-disable-threading --mlir-pass-pipeline-crash-reproducer=. %s -test-distinct-attrs | FileCheck %s
+
+// CHECK: #[[DIST0:.*]] = distinct[0]<42 : i32>
+// CHECK: #[[DIST1:.*]] = distinct[1]<42 : i32>
+#distinct = distinct[0]<42 : i32>
+
+// CHECK: @foo_1
+func.func @foo_1() {
+  // CHECK: "test.op"() {distinct.input = #[[DIST0]], distinct.output = #[[DIST1]]}
+  "test.op"() {distinct.input = #distinct} : () -> ()
+}

gysit

Thanks for fixing this issue! I added some nit comments.

I did not fully understand why there is a thread join when running with the crash reproducer. Isn't multi-threading disabled from the beginning in this case? I would have assumed the ThreadLocalCache should still work and only free the allocated attributes once the main thread terminates.

In any case, the change makes sense since there is no need for thread local allocators if multi-threading is disabled.

mlir/lib/IR/AttributeDetail.h

mlir/test/Dialect/LLVMIR/add-debuginfo-func-scope-with-crash-reproduction.mlir

mlir/lib/IR/AttributeDetail.h

joker-eph · 2025-02-25T12:40:23Z

Trying to see if I understand why it's not a problem when multi-threading is enabled: is this because we never joins thread for the lifetime of the MLIRContext?

Also, you're fixing the issue for -mlir-pass-pipeline-crash-reproducer when multi-threading is disabled, but what about the combination of -mlir-pass-pipeline-crash-reproducer and multi-threading?

abulavin · 2025-02-25T19:19:42Z

I did not fully understand why there is a thread join when running with the crash reproducer. Isn't multi-threading disabled from the beginning in this case? I would have assumed the ThreadLocalCache should still work and only free the allocated attributes once the main thread terminates.

When crash reproduction is enabled, the pass manager runs its pass pipeline on a separate thread here. This thread is created and joined outside of the MLIR context. My understanding is that disabling threading only disables the MLIR context threadpool, which has no effect on the thread in the link I provided that's used to run the pass pipeline. So the thread local cache (as intended) will allocate distinct attributes in storage local to this special 'crash recovery' thread, even when threading is disabled.

Trying to see if I understand why it's not a problem when multi-threading is enabled: is this because we never joins thread for the lifetime of the MLIRContext?

Also, you're fixing the issue for -mlir-pass-pipeline-crash-reproducer when multi-threading is disabled, but what about the combination of -mlir-pass-pipeline-crash-reproducer and multi-threading?

Ah apologies, I believe I have accidentally missed out some important information on this PR. The initial issue I was investigating involved generating local crash reproducers, which requires multithreading to be disabled. I need to double check what happens for regular crash reproducers with multi threading still enabled. Thanks for pointing this out.

…oducer generation

abulavin · 2025-02-28T16:01:33Z

@joker-eph @gysit Thanks for the review comments. After Mehdi's suggestion to test what happens when multi-threading and crash reproduction are both enabled, I have had to make some significant changes to fix this issue. As it turns out, enabling multithreading will still exhibit the same use-after-free error because the lifetime of the thread-local storage only lasts for the lifetime of the crash reproducer thread. To remedy this, I have added some extra logic to disable the use of thread-local storage in the DistinctAttributeAllocator when the PassManager runs with crash recovery enabled. Since in this case disabling thread-local storage now means thread-safety is a concern, if threading is not disabled then a synchronised allocator is used. I appreciate this may be less than ideal so I am open to suggestions to remedy this.

Please let me know if anything is not clear.

gysit

It seems like we need this additional flag for the crash recovery use case. We could of course drop thread local storage at the cost of more lock contention. It is unclear to me if that would be a good performance vs implementation complexity tradeoff. Probably not.

Another option could be to find a way to prolong the lifetime of the entries in the ThreadLocalCache. I doubt there is a good/low overhead solution though given it relies on thread local variables.

mlir/lib/IR/MLIRContext.cpp

mlir/lib/Pass/PassCrashRecovery.cpp

gysit

I missed the unreachable during the last review. I think if possible we should avoid it and I would also let the two disable functions just set one flag each and then interpret the flags in the allocate function.

Also the change is a bit outside of my area of expertise so it would be good to give the more knowledgeable reviewers time to have a look. Maybe @joker-eph could have a look to decide if the general approach makes sense.

mlir/lib/Pass/PassCrashRecovery.cpp

mlir/lib/IR/AttributeDetail.h

gysit

Nice, thanks for addressing the changes!

LGTM modulo some outdated comment.

Please give other reviewers some time to have a second look.

mlir/lib/IR/AttributeDetail.h

abulavin · 2025-03-05T17:40:27Z

Please give other reviewers some time to have a second look.

Thank you for your review comments! I can see that the github bot has not added any more reviewers to this PR. Should other people be pinged on this? If so, can you suggest who may be interested in reviewing this PR?

gysit · 2025-03-05T18:07:33Z

I added two more reviewers who may be interested. Feel free to land in a few days should there be no concerns.

River707

How much of a burden would it be to just enforce no multi-threading when doing crash reproducers? I'm a little worried about the complexity of differentiating thread local storage from multithreading being on.

mlir/include/mlir/IR/MLIRContext.h

River707 · 2025-03-05T22:53:46Z

mlir/lib/IR/AttributeDetail.h

@@ -23,7 +23,9 @@
 #include "mlir/Support/ThreadLocalCache.h"
 #include "llvm/ADT/APFloat.h"
 #include "llvm/ADT/PointerIntPair.h"
+#include "llvm/Support/ErrorHandling.h"


Why do you need error handling?

This is left over from a previous commit, apologies. I would like to clarify though, since clangd tells me there are other headers unused here, shall I remove those too as an en-passant tidy? Notably,

"mlir/Support/StorageUniquer.h"

"llvm/ADT/PointerIntPair.h"

"llvm/Support/TrailingObjects.h"

… storage

abulavin · 2025-03-06T17:18:31Z

How much of a burden would it be to just enforce no multi-threading when doing crash reproducers?

This has pass pipeline performance implications for downstream projects that have crash reproduction always on. For example, the downstream project https://github.com/triton-lang/triton has crash reproducer always enabled when running the pass manager, as you can see here: https://github.com/triton-lang/triton/blob/main/python/src/ir.cc#L1817-L1839. If we enforce multithreading to be disabled then projects like this will not be able to benefit from pass parallelisation.

I appreciate that disabling/enabling thread local storage does add complexity however my intention with this PR was not to necessarily make this part of the MLIRContext's public interface. It's only really the pass manager that needs to disable thread-local storage for safety reasons but I could not find a nice way to do this without affecting MLIRContext.h. I think if this complexity could somehow be abstracted away without affecting the MLIRContext's public methods then it minimises its impact.

As @gysit has mentioned, I think the tradeoff here is we either use a lock around the allocator and accept lock contention cost but avoid complexity surrounding toggling thread-local storage, or, we accept this complexity in favour of avoiding lock contention overhead. My opinion is that if we choose to drop using thread-local storage for DistinctAttributeStorage, the performance impact on MLIR overall will be minimal as it only affects the storage allocation for this one attribute. I don't foresee a situation where allocating DistinctAttributeStorage suddenly becomes a performance bottleneck due to there being a lock versus thread-local storage.

abulavin · 2025-03-12T22:28:35Z

@River707 @gysit thanks for the reviews! Please can you merge this PR for me as I do not have write access? Thanks

gysit · 2025-03-13T06:56:28Z

I am happy to land for you.

Can you update the PR description and title. I think the paragraph that describes the change is outdated now and it may be confusing to have this in the commit history. Especially since the plain allocator is not only used when threading is disabled.

I would also recommend a shorter titel such as [mlir] Fix DistinctAttributeUniquer or similar since the problem is not limited to the disable threading case.

abulavin · 2025-03-13T13:02:47Z

Thanks, I have updated the title and description and am all done now.

github-actions · 2025-03-13T14:00:59Z

@abulavin Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

…rash reproduction is enabled (llvm#128566) Currently, `DistinctAttr` uses an allocator wrapped in a `ThreadLocalCache` to manage attribute storage allocations. This ensures all allocations are freed when the allocator is destroyed. However, this setup can cause use-after-free errors when `mlir::PassManager` runs its passes on a separate thread as a result of crash reproduction being enabled. Distinct attribute storages are created in the child thread's local storage and freed once the thread joins. Attempting to access these attributes after this can result in segmentation faults, such as during printing or alias analysis. Example: This invocation of `mlir-opt` demonstrates the segfault issue due to distinct attributes being created in a child thread and their storage being freed once the thread joins: ``` mlir-opt --mlir-pass-pipeline-crash-reproducer=. --test-distinct-attrs mlir/test/IR/test-builtin-distinct-attrs.mlir ``` This pull request changes the distinct attribute allocator to use different allocators depending on whether or not threading is enabled and whether or not the pass manager is running its passes in a separate thread. If multithreading is disabled, a non thread-local allocator is used. If threading remains enabled and the pass manager invokes its pass pipelines in a child thread, then a non-thread local but synchronised allocator is used. This ensures that the lifetime of allocated storage persists beyond the lifetime of the child thread. I have added two tests for the `-test-distinct-attrs` pass and the `-enable-debug-info-on-llvm-scope` passes that run them with crash reproduction enabled.

Updating LLVM in order to pull in the following change: - llvm/llvm-project#128566 For context, crash reproduction generation in MLIR will run the `PassManager`'s passes in a child thread. The above PR fixes crashes for when passes such as `add_di_scope` add `DistinctAttr` to the IR and their storage is then accessed later once the child thread joins. Pulling this in improves QoL for out-of-tree projects and makes the pass manager more robust to the use of `DistinctAttr`. This pin update has also introduced the deprecation of a `llvm::TargetMachine::createTargetMachine` overload. I've updated the callsites to use the non-deprecated overloads. # New contributor declaration - [x] I am not making a trivial change, such as fixing a typo in a comment. - [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`. - Select one of the following. - [ ] I have added tests. - `/test` for `lit` tests - `/unittest` for C++ tests - `/python/test` for end-to-end tests - [x] This PR does not need a test because `this PR only updates the LLVM pin, so CI is sufficient`. - Select one of the following. - [x] I have not added any `lit` tests. - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

cota · 2025-03-25T13:48:51Z

@abulavin this is causing TSAN failures in downstream projects that we have at Google.
The problem is that the bitfields added here (threadingIsEnabled : 1 and useThreadLocalAllocator : 1) are accessed without synchronization. Example tsan report:

WARNING: ThreadSanitizer: data race (pid=337)
  Write of size 1 at 0x7260001d0ff0 by thread T224:
    #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29
    #1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40
    #2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8
[...]
Previous write of size 1 at 0x7260001d0ff0 by thread T222:
    #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29 
    #1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40
    #2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8

An obvious fix is to also access the bit fields with allocatorMutex being locked. However from reading the thread above it seems that your intent was to avoid contention if possible. You could define an std::atomic to keep these bits in there, but you would still have to refactor the code to only read the flag once per call to allocate.

Given that this is introducing a race, I'd normally revert the PR. However, since it is also fixing a crash, I'd suggest we apply the obvious, non-scalable fix first, then you can work on making this more performant. If that sounds good, I've uploaded #132935 for review.

The bitfields added in llvm#128566 (`threadingIsEnabled : 1` and `useThreadLocalAllocator : 1`) are accessed without synchronization. Example tsan report: ``` WARNING: ThreadSanitizer: data race (pid=337) Write of size 1 at 0x7260001d0ff0 by thread T224: #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29 llvm#1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40 llvm#2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8 [...] Previous write of size 1 at 0x7260001d0ff0 by thread T222: #0 disableThreadLocalStorage third_party/llvm/llvm-project/mlir/lib/IR/AttributeDetail.h:434:29 llvm#1 mlir::MLIRContext::disableThreadLocalStorage(bool) third_party/llvm/llvm-project/mlir/lib/IR/MLIRContext.cpp:723:40 llvm#2 mlir::PassManager::runWithCrashRecovery(mlir::Operation*, mlir::AnalysisManager) third_party/llvm/llvm-project/mlir/lib/Pass/PassCrashRecovery.cpp:423:8 ``` Fix the race by serializing accesses to these fields with the existing `allocatorMutex`.

Updating LLVM in order to pull in the following change: - llvm/llvm-project#128566 For context, crash reproduction generation in MLIR will run the `PassManager`'s passes in a child thread. The above PR fixes crashes for when passes such as `add_di_scope` add `DistinctAttr` to the IR and their storage is then accessed later once the child thread joins. Pulling this in improves QoL for out-of-tree projects and makes the pass manager more robust to the use of `DistinctAttr`. This pin update has also introduced the deprecation of a `llvm::TargetMachine::createTargetMachine` overload. I've updated the callsites to use the non-deprecated overloads. - [x] I am not making a trivial change, such as fixing a typo in a comment. - [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`. - Select one of the following. - [ ] I have added tests. - `/test` for `lit` tests - `/unittest` for C++ tests - `/python/test` for end-to-end tests - [x] This PR does not need a test because `this PR only updates the LLVM pin, so CI is sufficient`. - Select one of the following. - [x] I have not added any `lit` tests. - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

cota · 2025-03-25T21:34:16Z

There is a follow-up discussion in #132935 and the conclusion is that we should revert this PR. Will do so now.

…e when crash reproduction is enabled" (#133000) Reverts #128566. See as well the discussion in #132935.

…bute storage when crash reproduction is enabled" (#133000) Reverts llvm/llvm-project#128566. See as well the discussion in llvm/llvm-project#132935.

Updating LLVM in order to pull in the following change: - llvm/llvm-project#128566 For context, crash reproduction generation in MLIR will run the `PassManager`'s passes in a child thread. The above PR fixes crashes for when passes such as `add_di_scope` add `DistinctAttr` to the IR and their storage is then accessed later once the child thread joins. Pulling this in improves QoL for out-of-tree projects and makes the pass manager more robust to the use of `DistinctAttr`. This pin update has also introduced the deprecation of a `llvm::TargetMachine::createTargetMachine` overload. I've updated the callsites to use the non-deprecated overloads. # New contributor declaration - [x] I am not making a trivial change, such as fixing a typo in a comment. - [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`. - Select one of the following. - [ ] I have added tests. - `/test` for `lit` tests - `/unittest` for C++ tests - `/python/test` for end-to-end tests - [x] This PR does not need a test because `this PR only updates the LLVM pin, so CI is sufficient`. - Select one of the following. - [x] I have not added any `lit` tests. - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

Updating LLVM in order to pull in the following change: - llvm/llvm-project#128566 For context, crash reproduction generation in MLIR will run the `PassManager`'s passes in a child thread. The above PR fixes crashes for when passes such as `add_di_scope` add `DistinctAttr` to the IR and their storage is then accessed later once the child thread joins. Pulling this in improves QoL for out-of-tree projects and makes the pass manager more robust to the use of `DistinctAttr`. This pin update has also introduced the deprecation of a `llvm::TargetMachine::createTargetMachine` overload. I've updated the callsites to use the non-deprecated overloads. - [x] I am not making a trivial change, such as fixing a typo in a comment. - [x] I have written a PR description following these [rules](https://cbea.ms/git-commit/#why-not-how). - [x] I have run `pre-commit run --from-ref origin/main --to-ref HEAD`. - Select one of the following. - [ ] I have added tests. - `/test` for `lit` tests - `/unittest` for C++ tests - `/python/test` for end-to-end tests - [x] This PR does not need a test because `this PR only updates the LLVM pin, so CI is sufficient`. - Select one of the following. - [x] I have not added any `lit` tests. - [ ] The `lit` tests I have added follow these [best practices](https://mlir.llvm.org/getting_started/TestingGuide/#filecheck-best-practices), including the "tests should be minimal" section. (Usually running Python code and using the instructions it generates is not minimal.)

Use non thread-local allocator for DistinctAttr when threading is dis…

b4a909e

…abled

llvmbot added mlir:core MLIR Core Infrastructure mlir:llvm mlir labels Feb 24, 2025

gysit reviewed Feb 25, 2025

View reviewed changes

gysit requested a review from River707 February 25, 2025 10:40

abulavin added 2 commits February 28, 2025 15:28

fixup: Disabel threadlocal storage to handle multithreaded crash repr…

55a7a38

…oducer generation

fixup: add tests with multithreading enabled

8ace71b

gysit reviewed Mar 1, 2025

View reviewed changes

mlir/lib/IR/MLIRContext.cpp Outdated Show resolved Hide resolved

mlir/lib/Pass/PassCrashRecovery.cpp Outdated Show resolved Hide resolved

abulavin added 2 commits March 3, 2025 18:06

fixup: use disableMultiThreading in MLIRContext::disableMultiThreading

d611dba

fixup: use llvm::make_scope_exit instead of custom RAII guard

3792201

gysit reviewed Mar 3, 2025

View reviewed changes

mlir/lib/Pass/PassCrashRecovery.cpp Outdated Show resolved Hide resolved

mlir/lib/IR/AttributeDetail.h Outdated Show resolved Hide resolved

mlir/lib/IR/AttributeDetail.h Outdated Show resolved Hide resolved

abulavin added 2 commits March 4, 2025 23:39

fixup: use MLIRContext* instead of auto*

68843d5

fixup: replace unreachable with clearer logic in allocate method

8baecc9

gysit approved these changes Mar 5, 2025

View reviewed changes

mlir/lib/IR/AttributeDetail.h Outdated Show resolved Hide resolved

mlir/lib/IR/AttributeDetail.h Outdated Show resolved Hide resolved

fixup: update outdated docs comments

e7fd522

gysit requested review from zero9178 and Mogball March 5, 2025 18:05

River707 reviewed Mar 5, 2025

View reviewed changes

abulavin added 2 commits March 6, 2025 16:50

fixup: note that disabling threading implicitly disables thread-local…

7fffb59

… storage

fixup: remove unused error handling header

da4e176

River707 approved these changes Mar 12, 2025

View reviewed changes

abulavin changed the title ~~[mlir] Use non thread-local allocator for DistinctAttr when threading is disabled~~ [mlir] Fix DistinctAttributeUniquer deleting storage when crash reproduction is enabled Mar 13, 2025

abulavin changed the title ~~[mlir] Fix DistinctAttributeUniquer deleting storage when crash reproduction is enabled~~ [mlir] Fix DistinctAttributeUniquer deleting attribute storage Mar 13, 2025

abulavin changed the title ~~[mlir] Fix DistinctAttributeUniquer deleting attribute storage~~ [mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled Mar 13, 2025

gysit merged commit 0aa5ba4 into llvm:main Mar 13, 2025
7 checks passed

This was referenced Mar 18, 2025

[TEST] Refactor reproducer test to be device-agnostic triton-lang/triton#6230

Merged

Bump to llvm/llvm-project@0aa5ba4 triton-lang/triton#6266

Merged

cota mentioned this pull request Mar 25, 2025

[mlir] DistinctAttributeAllocator: fix race #132935

Closed

cota mentioned this pull request Mar 25, 2025

Revert "[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled" #133000

Merged

cota added a commit that referenced this pull request Mar 25, 2025

Revert "[mlir] Fix DistinctAttributeUniquer deleting attribute storag…

2da4ce8

…e when crash reproduction is enabled" (#133000) Reverts #128566. See as well the discussion in #132935.

[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled #128566

[mlir] Fix DistinctAttributeUniquer deleting attribute storage when crash reproduction is enabled #128566

Uh oh!

Conversation

abulavin commented Feb 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Feb 24, 2025

Uh oh!

llvmbot commented Feb 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Feb 24, 2025

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

joker-eph commented Feb 25, 2025

Uh oh!

abulavin commented Feb 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abulavin commented Feb 28, 2025

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gysit left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

abulavin commented Mar 5, 2025

Uh oh!

gysit commented Mar 5, 2025

Uh oh!

River707 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

River707 Mar 5, 2025

Choose a reason for hiding this comment

Uh oh!

abulavin Mar 6, 2025

Choose a reason for hiding this comment

Uh oh!

abulavin commented Mar 6, 2025

Uh oh!

abulavin commented Mar 12, 2025

Uh oh!

gysit commented Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abulavin commented Mar 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented Mar 13, 2025

Uh oh!

cota commented Mar 25, 2025

Uh oh!

cota commented Mar 25, 2025

Uh oh!

Uh oh!

abulavin commented Feb 24, 2025 •

edited

Loading

llvmbot commented Feb 24, 2025 •

edited

Loading

abulavin commented Feb 25, 2025 •

edited

Loading

gysit commented Mar 13, 2025 •

edited

Loading

abulavin commented Mar 13, 2025 •

edited

Loading