Skip to content

[msan] Dynamically grow kNumStackOriginDescrs #92826

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

thurstond
Copy link
Contributor

@thurstond thurstond commented May 20, 2024

StackOriginDescr/StackOriginPC are currently fixed size arrays
and have a FIXME of resizability. This patch dynamically doubles
the size of these arrays as needed (for runtime that is amortized linear
in the size of the array).

Since NumStackOriginDescrs (the number of items currently stored in the array)
is monotonically increasing, there is no need to ever decrease the
allocated size of the array.

Testing notes: the old fixed limit was 1 million variables; this would be an
extremely large test case. By manually capping the maximum size, I found
that param_tls_limit.cpp and compress_stack_depot.cpp are the tests that are most sensitive to
kNumStackOriginDescrs.

StackOriginDescr/StackOriginPC are currently fixed size arrays
and have a FIXME of resizability. This patch dynamically doubles
the size of these arrays as needed (for runtime that is amortized linear
in the size of the array).

Since NumStackOriginDescrs (the number of items currently stored in the array)
is monotonically increasing, there is no need to ever decrease the
allocated size of the array.

Testing notes: the old fixed limit was 1 million variables; this would be an
extremely large test case. By manually capping the maximum size, I found
that param_tls_limit.cpp is the test that is most sensitive to
kNumStackOriginDescrs.
@llvmbot
Copy link
Member

llvmbot commented May 20, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

StackOriginDescr/StackOriginPC are currently fixed size arrays
and have a FIXME of resizability. This patch dynamically doubles
the size of these arrays as needed (for runtime that is amortized linear
in the size of the array).

Since NumStackOriginDescrs (the number of items currently stored in the array)
is monotonically increasing, there is no need to ever decrease the
allocated size of the array.

Testing notes: the old fixed limit was 1 million variables; this would be an
extremely large test case. By manually capping the maximum size, I found
that param_tls_limit.cpp is the test that is most sensitive to
kNumStackOriginDescrs.


Full diff: https://github.com/llvm/llvm-project/pull/92826.diff

1 Files Affected:

  • (modified) compiler-rt/lib/msan/msan.cpp (+25-4)
diff --git a/compiler-rt/lib/msan/msan.cpp b/compiler-rt/lib/msan/msan.cpp
index a2fc27de1901b..e66b5145f83de 100644
--- a/compiler-rt/lib/msan/msan.cpp
+++ b/compiler-rt/lib/msan/msan.cpp
@@ -99,10 +99,10 @@ bool msan_init_is_running;
 int msan_report_count = 0;
 
 // Array of stack origins.
-// FIXME: make it resizable.
-static const uptr kNumStackOriginDescrs = 1024 * 1024;
-static const char *StackOriginDescr[kNumStackOriginDescrs];
-static uptr StackOriginPC[kNumStackOriginDescrs];
+// This is resized dynamically by ResizeStackOriginMetadata
+static uptr kNumStackOriginDescrs = 0;
+static char **StackOriginDescr = nullptr;
+static uptr *StackOriginPC = nullptr;
 static atomic_uint32_t NumStackOriginDescrs;
 
 void Flags::SetDefaults() {
@@ -282,7 +282,27 @@ void ScopedThreadLocalStateBackup::Restore() {
 void UnpoisonThreadLocalState() {
 }
 
+// Grow stack origin metadata structures exponentially if out of space
+static void ResizeStackOriginMetadata(u32 id) {
+  if (id >= kNumStackOriginDescrs) {
+    if (kNumStackOriginDescrs == 0)
+      kNumStackOriginDescrs = 1;
+    else
+      kNumStackOriginDescrs = kNumStackOriginDescrs * 2;
+
+    VReport(2, "Resized stack origin metadata to %lu entries\n",
+            kNumStackOriginDescrs);
+  }
+
+  StackOriginDescr = (char **)InternalRealloc(
+      StackOriginDescr, kNumStackOriginDescrs * sizeof(char *));
+  StackOriginPC = (uptr *)InternalRealloc(StackOriginPC,
+                                          kNumStackOriginDescrs * sizeof(uptr));
+  // No checks needed - InternalRealloc() will die if out of memory
+}
+
 const char *GetStackOriginDescr(u32 id, uptr *pc) {
+  ResizeStackOriginMetadata(id);
   CHECK_LT(id, kNumStackOriginDescrs);
   if (pc) *pc = StackOriginPC[id];
   return StackOriginDescr[id];
@@ -312,6 +332,7 @@ static inline void SetAllocaOrigin(void *a, uptr size, u32 *id_ptr, char *descr,
   u32 id = *id_ptr;
   if (id == 0 || id == first_timer) {
     u32 idx = atomic_fetch_add(&NumStackOriginDescrs, 1, memory_order_relaxed);
+    ResizeStackOriginMetadata(idx);
     CHECK_LT(idx, kNumStackOriginDescrs);
     StackOriginDescr[idx] = descr;
     StackOriginPC[idx] = pc;

@thurstond thurstond requested a review from kstoimenov May 20, 2024 21:48
Copy link
Contributor

@eugenis eugenis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread safe. I do not think you can do realloc() if you want the mutation to be lock free.

We could simply increase the constant size of the buffer (not perfect, but it is just .bss, not real memory).

We could go with an array of pointers to exponentially increasing allocations (i.e. same as your case, but never deallocate, instead just allocate more and consider the container a logical concatenation of multiple buffers).

thurstond added a commit to thurstond/llvm-project that referenced this pull request May 20, 2024
This increases the constant size of kNumStackOriginDescrs to
64M (1GB of BSS across two arrays), which ought to be enough for
anybody.

This is the easier alternative suggested by eugenis@ in
llvm#92826.
@thurstond
Copy link
Contributor Author

This is not thread safe. I do not think you can do realloc() if you want the mutation to be lock free.

Thanks for the review, I had overlooked thread safety.

We could simply increase the constant size of the buffer (not perfect, but it is just .bss, not real memory).

This is implemented in #92838

We could go with an array of pointers to exponentially increasing allocations (i.e. same as your case, but never deallocate, instead just allocate more and consider the container a logical concatenation of multiple buffers).

This is still tricky to implement in a thread-safe, lock-free manner, because there are two arrays to be grown (StackOriginDescrs, StackOriginPC).

@thurstond thurstond closed this May 21, 2024
@eugenis
Copy link
Contributor

eugenis commented May 21, 2024

We could go with an array of pointers to exponentially increasing allocations (i.e. same as your case, but never deallocate, instead just allocate more and consider the container a logical concatenation of multiple buffers).

This is still tricky to implement in a thread-safe, lock-free manner, because there are two arrays to be grown (StackOriginDescrs, StackOriginPC).

I think it will actually work pretty well.

If subarrays are power of 2 sized (1, 2, 4, 8, 16, ...), then the subarray for index i is basically log2(i+1). Both arrays can be handled independently; once the index is acquired, the subarray pointer is initialized with a double-checked locking pattern.

@eugenis
Copy link
Contributor

eugenis commented May 21, 2024

But then, it seems easier to just MAP_NORESERVE a large enough array in msan_init. It can be really large then, no concerns about relocation offset.

thurstond added a commit that referenced this pull request May 21, 2024
This increases the constant size of kNumStackOriginDescrs to 4M (64GB of
BSS across two arrays), which ought to be enough for anybody.

This is the easier alternative suggested by eugenis@ in
#92826.
thurstond added a commit that referenced this pull request May 28, 2024
…erPC) (#93117)

The original pull request
(#92838) was reverted due to a
PowerPC buildbot breakage
(df626dd).
This reland limits the scope of the change to non-PowerPC platforms. I
am unaware of any PowerPC use cases that would benefit from a larger
kNumStackOriginDescrs constant.

Original CL description: This increases the constant size of
kNumStackOriginDescrs to 4M (64GB of BSS across two arrays), which ought
to be enough for anybody.

This is the easier alternative suggested by eugenis@ in
#92826.
vg0204 pushed a commit to vg0204/llvm-project that referenced this pull request May 29, 2024
…erPC) (llvm#93117)

The original pull request
(llvm#92838) was reverted due to a
PowerPC buildbot breakage
(llvm@df626dd).
This reland limits the scope of the change to non-PowerPC platforms. I
am unaware of any PowerPC use cases that would benefit from a larger
kNumStackOriginDescrs constant.

Original CL description: This increases the constant size of
kNumStackOriginDescrs to 4M (64GB of BSS across two arrays), which ought
to be enough for anybody.

This is the easier alternative suggested by eugenis@ in
llvm#92826.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants