Skip to content

[libc++] Fix shrink_to_fit to swap buffer only when capacity is strictly smaller #127321

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Feb 22, 2025

Conversation

winner245
Copy link
Contributor

The current implementation of the shrink_to_fit() function of basic_string swaps to the newly allocated buffer when the new buffer has the same capacity as the existing one. While this is not incorrect, it is truly unnecessary to swap to an equally-sized buffer. With equal capacity, we should keep using the existing buffer and simply deallocate the new one, avoiding the extra work of copying elements.

The desired behavior was documented in the following comment within the function:

// The Standard mandates shrink_to_fit() does not increase the capacity.
// With equal capacity keep the existing buffer. This avoids extra work
// due to swapping the elements.
if (__allocation.count - 1 > capacity()) {
__alloc_traits::deallocate(__alloc_, __allocation.ptr, __allocation.count);
return;
}

However, the existing implementation did not exactly conform to this guideline, which is a QoI matter.

This PR modifies the shrink_to_fit() function to ensure that the buffer is only swapped when the new allocation is strictly smaller than the existing one. When the capacities are equal, the new buffer will be discarded without copying the elements. This is achieved by including the == check in the above conditional logic.

@winner245 winner245 added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Feb 15, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 15, 2025

@llvm/pr-subscribers-libcxx

Author: Peng Liu (winner245)

Changes

The current implementation of the shrink_to_fit() function of basic_string swaps to the newly allocated buffer when the new buffer has the same capacity as the existing one. While this is not incorrect, it is truly unnecessary to swap to an equally-sized buffer. With equal capacity, we should keep using the existing buffer and simply deallocate the new one, avoiding the extra work of copying elements.

The desired behavior was documented in the following comment within the function:

// The Standard mandates shrink_to_fit() does not increase the capacity.
// With equal capacity keep the existing buffer. This avoids extra work
// due to swapping the elements.
if (__allocation.count - 1 > capacity()) {
__alloc_traits::deallocate(__alloc_, __allocation.ptr, __allocation.count);
return;
}

However, the existing implementation did not exactly conform to this guideline, which is a QoI matter.

This PR modifies the shrink_to_fit() function to ensure that the buffer is only swapped when the new allocation is strictly smaller than the existing one. When the capacities are equal, the new buffer will be discarded without copying the elements. This is achieved by including the == check in the above conditional logic.


Full diff: https://github.com/llvm/llvm-project/pull/127321.diff

1 Files Affected:

  • (modified) libcxx/include/string (+1-1)
diff --git a/libcxx/include/string b/libcxx/include/string
index b280f5f458459..d2fe6c0847bd1 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -3560,7 +3560,7 @@ inline _LIBCPP_CONSTEXPR_SINCE_CXX20 void basic_string<_CharT, _Traits, _Allocat
     // The Standard mandates shrink_to_fit() does not increase the capacity.
     // With equal capacity keep the existing buffer. This avoids extra work
     // due to swapping the elements.
-    if (__allocation.count - 1 > capacity()) {
+    if (__allocation.count - 1 >= capacity()) {
       __alloc_traits::deallocate(__alloc_, __allocation.ptr, __allocation.count);
       return;
     }

@winner245 winner245 marked this pull request as ready for review February 15, 2025 14:09
@winner245 winner245 requested a review from a team as a code owner February 15, 2025 14:09
Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a libc++-specific test to make sure we don't regress on this.

@winner245 winner245 force-pushed the fix-shrink-to-fit branch 2 times, most recently from 3705072 to 7ec607a Compare February 16, 2025 21:30
Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with the comments addressed.

Comment on lines 78 to 103
{ // Ensure that the libc++ implementation of shrink_to_fit does NOT swap buffer with equal allocation sizes
{ // Test with custom allocator with a minimum allocation size
std::basic_string<char, std::char_traits<char>, min_size_allocator<128, char> > s(
"A long string exceeding SSO limit but within min alloc size");
std::size_t capacity = s.capacity();
std::size_t size = s.size();
auto data = s.data();
s.shrink_to_fit();
assert(s.capacity() <= capacity);
assert(s.size() == size);
LIBCPP_ASSERT(is_string_asan_correct(s));
LIBCPP_ASSERT(s.capacity() == capacity && s.data() == data);
}
{ // Test with custom allocator with a minimum power of two allocation size
std::basic_string<char, std::char_traits<char>, pow2_allocator<char> > s(
"This is a long string that exceeds the SSO limit");
std::size_t capacity = s.capacity();
std::size_t size = s.size();
auto data = s.data();
s.shrink_to_fit();
assert(s.capacity() <= capacity);
assert(s.size() == size);
LIBCPP_ASSERT(is_string_asan_correct(s));
LIBCPP_ASSERT(s.capacity() == capacity && s.data() == data);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move these tests to libcxx/test/libcxx/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp and replace the LIBCPP_ASSERT with normal asserts. I don't think we're really adding coverage for anything that's not libc++.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestions. I've moved the tests to the libc++ specific location libcxx/test/libcxx/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp, and made the necessary changes there. Also, I've removed allocate_at_least and only kept allocate, as they essentially does the same thing.

I hope you could provide me with some further advices on better integration of my tests with the shrink_to_fit tests already there. Currently, there is a test there addressing an issue where the allocation would be unexpectedly discarded even if it is smaller than the current capacity. This test is in a separate function test_oversizing_allocator(). Since we are moving new tests to the same file, I am wondering if it is a good idea to integrate these tests into a single top-level function called test()? This would allow us to perform runtime assertion as well as static assertion as we usually do as follows:

TEST_CONSTEXPR_CXX20 bool test() {
  // (The existing test there) 
  // Make sure we use an allocation returned by allocate_at_least if it is smaller 
  // than the current allocation even if it contains more bytes than we requested.
  // Fix issue: https://github.com/llvm/llvm-project/pull/115659
  {
    ...
  }

  // (My new tests moved over) 
  // Ensure that the libc++ implementation of shrink_to_fit does NOT swap buffer 
  // with equal allocation sizes
  {
    ...
  }
}

Then we can do our usual static_assert in one place:

  test();
#if TEST_STD_VER > 17
  static_assert(test());
#endif

Moreover, do you think it is a good idea to move the oversizing_allocator class definition to a common location containing the definitions of all these special allocators (e.g., min_size_allocator, pow2_allocator, increasing_allocator). These allocators are exclusively used for shrink_to_fit tests. We may move them into increasing_allocator.h and then rename the file to something like test_allocators_for_shrink_to_fit?

I would appreciate your input on this. Thanks again for your guidance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I received an ASAN failure due to new-delete-type-mismatch because allocate(n) allocates more than n, whereas deallocate(p, n) receives a smaller n (because allocate() is unable to report the actual allocation size like allocate_at_least() does). Given this, I have to implement the allocators using operator new and operator delete instead of delegating to std::allocator<T>::allocate() and std::allocator<T>::deallocate(). This change should fix the ASAN CI failure.

@philnik777 philnik777 merged commit 31824b2 into llvm:main Feb 22, 2025
82 checks passed
@winner245 winner245 deleted the fix-shrink-to-fit branch February 22, 2025 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants