Commit 062af36

[libc++] Take advantage of trivial relocation in std::vector::erase
In vector::erase(iter) and vector::erase(iter, iter), we can take advantage of a type being trivially relocatable to open up a gap in the vector and then relocate the tail of the vector into that gap. The benefit is that relocating an object is often more efficient than move-assigning and then destroying the original object. For types that can be relocated trivially but that are complicated enough for the compiler not to optimize by itself (like std::string), this provides around a 2x performance speedup in vector::erase (see below).

This optimization requires stopping the usage of Clang's __is_trivially_relocatable builtin, which doesn't currently honour assignment operators like is_trivially_copyable does and can lead us to perform incorrect optimizations.

It is also worth noting that __uninitialized_allocator_relocate has to be modified so that we can relocate into an overlapping range. This has an unfortunate impact on its exception safety guarantees, which needs to be investigated further.
Previous implementation
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_erase_iter_in_middle/vector_int/1024           24.9 ns      24.9 ns    28042962
BM_erase_iter_in_middle/vector_int/4096            107 ns       107 ns     6590592
BM_erase_iter_in_middle/vector_int/10240           271 ns       265 ns     2733478
BM_erase_iter_in_middle/vector_string/1024         349 ns       349 ns     2005886
BM_erase_iter_in_middle/vector_string/4096        1410 ns      1406 ns      498355
BM_erase_iter_in_middle/vector_string/10240       3449 ns      3449 ns      201989
BM_erase_iter_at_start/vector_int/1024            47.1 ns      47.1 ns    14836261
BM_erase_iter_at_start/vector_int/4096             204 ns       204 ns     3430414
BM_erase_iter_at_start/vector_int/10240            504 ns       504 ns     1391373
BM_erase_iter_at_start/vector_string/1024          684 ns       684 ns     1025160
BM_erase_iter_at_start/vector_string/4096         2855 ns      2806 ns      254080
BM_erase_iter_at_start/vector_string/10240        7060 ns      7060 ns       94134

New implementation
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_erase_iter_in_middle/vector_int/1024           26.0 ns      25.9 ns    27127367
BM_erase_iter_in_middle/vector_int/4096            105 ns       105 ns     6515204
BM_erase_iter_in_middle/vector_int/10240           259 ns       258 ns     2800795
BM_erase_iter_in_middle/vector_string/1024         148 ns       147 ns     4725706
BM_erase_iter_in_middle/vector_string/4096         608 ns       606 ns     1168205
BM_erase_iter_in_middle/vector_string/10240       1523 ns      1520 ns      459909
BM_erase_iter_at_start/vector_int/1024            47.1 ns      47.1 ns    14762513
BM_erase_iter_at_start/vector_int/4096             205 ns       205 ns     3403130
BM_erase_iter_at_start/vector_int/10240            507 ns       507 ns     1382716
BM_erase_iter_at_start/vector_string/1024          300 ns       300 ns     2327546
BM_erase_iter_at_start/vector_string/4096         1205 ns      1205 ns      580855
BM_erase_iter_at_start/vector_string/10240        4296 ns      4296 ns      162956
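To put a number on "around 2x", the ratios can be computed directly from the Time column of the two runs above. The snippet below is only a worked calculation; the helper name `speedup` and the variable names are made up for illustration.

```cpp
// Speedup = previous time / new time, using the Time column (in ns) of the
// benchmark output in the commit message. `speedup` is a hypothetical helper.
constexpr double speedup(double previous_ns, double new_ns) { return previous_ns / new_ns; }

// vector<string>, erase in the middle
constexpr double string_mid_1024  = speedup(349.0, 148.0);   // about 2.36
constexpr double string_mid_4096  = speedup(1410.0, 608.0);  // about 2.32
constexpr double string_mid_10240 = speedup(3449.0, 1523.0); // about 2.26

// vector<string>, erase at the start
constexpr double string_start_1024  = speedup(684.0, 300.0);   // about 2.28
constexpr double string_start_4096  = speedup(2855.0, 1205.0); // about 2.37
constexpr double string_start_10240 = speedup(7060.0, 4296.0); // about 1.64

// vector<int> is essentially unchanged (already trivially copyable)
constexpr double int_mid_1024 = speedup(24.9, 26.0); // about 0.96

static_assert(string_mid_1024 > 2.0, "std::string erase in the middle is more than 2x faster");
static_assert(int_mid_1024 > 0.9 && int_mid_1024 < 1.1, "int erase is within noise");
```

The largest std::string case at the start of the vector comes out closer to 1.6x than 2x, which fits the release note's caveat that the gain is subject to the use case.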
1 parent 7f8c872 commit 062af36

7 files changed

+106
-67
lines changed

libcxx/docs/ReleaseNotes/20.rst

Lines changed: 3 additions & 0 deletions
@@ -52,6 +52,9 @@ Improvements and New Features
 - The ``lexicographical_compare`` and ``ranges::lexicographical_compare`` algorithms have been optimized for trivially
   equality comparable types, resulting in a performance improvement of up to 40x.

+- The ``std::vector::erase`` function has been optimized for types that can be relocated trivially (such as ``std::string``),
+  yielding speed ups witnessed to be around 2x for these types (but subject to the use case).
+
 - The ``_LIBCPP_ENABLE_CXX20_REMOVED_TEMPORARY_BUFFER`` macro has been added to make ``std::get_temporary_buffer`` and
   ``std::return_temporary_buffer`` available.

libcxx/include/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -551,6 +551,7 @@ set(files
   __memory/construct_at.h
   __memory/destruct_n.h
   __memory/inout_ptr.h
+  __memory/is_trivially_allocator_relocatable.h
   __memory/noexcept_move_assign_container.h
   __memory/out_ptr.h
   __memory/pointer_traits.h

libcxx/include/__memory/is_trivially_allocator_relocatable.h

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
+#define _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
+
+#include <__config>
+#include <__memory/allocator_traits.h>
+#include <__type_traits/integral_constant.h>
+#include <__type_traits/is_trivially_relocatable.h>
+#include <__type_traits/negation.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+// A type is trivially allocator relocatable if the allocator's move construction and destruction
+// don't do anything beyond calling the type's move constructor and destructor, and if the type
+// itself is trivially relocatable.
+
+template <class _Alloc, class _Type>
+struct __allocator_has_trivial_move_construct : _Not<__has_construct<_Alloc, _Type*, _Type&&> > {};
+
+template <class _Type>
+struct __allocator_has_trivial_move_construct<allocator<_Type>, _Type> : true_type {};
+
+template <class _Alloc, class _Tp>
+struct __allocator_has_trivial_destroy : _Not<__has_destroy<_Alloc, _Tp*> > {};
+
+template <class _Tp, class _Up>
+struct __allocator_has_trivial_destroy<allocator<_Tp>, _Up> : true_type {};
+
+template <class _Alloc, class _Tp>
+struct __is_trivially_allocator_relocatable
+    : integral_constant<bool,
+                        __allocator_has_trivial_move_construct<_Alloc, _Tp>::value &&
+                            __allocator_has_trivial_destroy<_Alloc, _Tp>::value &&
+                            __libcpp_is_trivially_relocatable<_Tp>::value> {};
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
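
For readers outside libc++, the idea behind this header can be approximated with public C++17 facilities. The following is a rough sketch, not the library's implementation: `has_member_move_construct`, `has_member_destroy`, `is_trivially_allocator_relocatable`, `PlainAllocator` and `HookedAllocator` are hypothetical names, and `std::is_trivially_copyable` stands in for the internal `__libcpp_is_trivially_relocatable` (so `std::string`, which libc++ treats as trivially relocatable, is rejected here).

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical detectors standing in for libc++'s internal __has_construct / __has_destroy.
template <class Alloc, class T, class = void>
struct has_member_move_construct : std::false_type {};
template <class Alloc, class T>
struct has_member_move_construct<
    Alloc, T, std::void_t<decltype(std::declval<Alloc&>().construct(std::declval<T*>(), std::declval<T&&>()))>>
    : std::true_type {};

template <class Alloc, class T, class = void>
struct has_member_destroy : std::false_type {};
template <class Alloc, class T>
struct has_member_destroy<Alloc, T, std::void_t<decltype(std::declval<Alloc&>().destroy(std::declval<T*>()))>>
    : std::true_type {};

// Sketch of the combined trait: the allocator must not customize construct/destroy,
// and the type itself must be relocatable (approximated by trivially copyable).
template <class Alloc, class T>
struct is_trivially_allocator_relocatable
    : std::integral_constant<bool,
                             !has_member_move_construct<Alloc, T>::value &&
                                 !has_member_destroy<Alloc, T>::value &&
                                 std::is_trivially_copyable<T>::value> {};

// std::allocator is special-cased, as in the header: its construct/destroy are known to be trivial.
template <class T>
struct is_trivially_allocator_relocatable<std::allocator<T>, T> : std::is_trivially_copyable<T> {};

// A minimal allocator with no construct/destroy members.
template <class T>
struct PlainAllocator {
  using value_type = T;
  T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); }
  void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// An allocator that hooks construction and destruction, so bypassing them would be observable.
template <class T>
struct HookedAllocator : PlainAllocator<T> {
  template <class U, class... Args>
  void construct(U* p, Args&&... args) { ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...); }
  template <class U>
  void destroy(U* p) { p->~U(); }
};

static_assert(is_trivially_allocator_relocatable<std::allocator<int>, int>::value, "");
static_assert(is_trivially_allocator_relocatable<PlainAllocator<int>, int>::value, "");
static_assert(!is_trivially_allocator_relocatable<HookedAllocator<int>, int>::value, "");
// Only false because of the is_trivially_copyable approximation used in this sketch.
static_assert(!is_trivially_allocator_relocatable<std::allocator<std::string>, std::string>::value, "");
```

The allocator check matters because a `memmove` would silently skip any bookkeeping that a user-provided `construct` or `destroy` performs.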

libcxx/include/__memory/uninitialized_algorithms.h

Lines changed: 16 additions & 37 deletions
@@ -20,6 +20,7 @@
 #include <__memory/addressof.h>
 #include <__memory/allocator_traits.h>
 #include <__memory/construct_at.h>
+#include <__memory/is_trivially_allocator_relocatable.h>
 #include <__memory/pointer_traits.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/extent.h>
@@ -591,60 +592,38 @@ __uninitialized_allocator_copy(_Alloc& __alloc, _Iter1 __first1, _Sent1 __last1,
   return std::__rewrap_iter(__first2, __result);
 }

-template <class _Alloc, class _Type>
-struct __allocator_has_trivial_move_construct : _Not<__has_construct<_Alloc, _Type*, _Type&&> > {};
-
-template <class _Type>
-struct __allocator_has_trivial_move_construct<allocator<_Type>, _Type> : true_type {};
-
-template <class _Alloc, class _Tp>
-struct __allocator_has_trivial_destroy : _Not<__has_destroy<_Alloc, _Tp*> > {};
-
-template <class _Tp, class _Up>
-struct __allocator_has_trivial_destroy<allocator<_Tp>, _Up> : true_type {};
-
 // __uninitialized_allocator_relocate relocates the objects in [__first, __last) into __result.
+//
 // Relocation means that the objects in [__first, __last) are placed into __result as-if by move-construct and destroy,
 // except that the move constructor and destructor may never be called if they are known to be equivalent to a memcpy.
 //
-// Preconditions: __result doesn't contain any objects and [__first, __last) contains objects
+// This algorithm works even if part of the resulting range overlaps with [__first, __last), as long as __result itself
+// is not in [__first, __last).
+//
+// Preconditions:
+//  - __result doesn't contain any objects and [__first, __last) contains objects
+//  - __result is not in [__first, __last)
 // Postconditions: __result contains the objects from [__first, __last) and
 //                 [__first, __last) doesn't contain any objects
-//
-// The strong exception guarantee is provided if any of the following are true:
-//  - is_nothrow_move_constructible<_ValueType>
-//  - is_copy_constructible<_ValueType>
-//  - __libcpp_is_trivially_relocatable<_ValueType>
 template <class _Alloc, class _ContiguousIterator>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 void __uninitialized_allocator_relocate(
     _Alloc& __alloc, _ContiguousIterator __first, _ContiguousIterator __last, _ContiguousIterator __result) {
   static_assert(__libcpp_is_contiguous_iterator<_ContiguousIterator>::value, "");
   using _ValueType = typename iterator_traits<_ContiguousIterator>::value_type;
   static_assert(__is_cpp17_move_insertable<_Alloc>::value,
                 "The specified type does not meet the requirements of Cpp17MoveInsertable");
-  if (__libcpp_is_constant_evaluated() || !__libcpp_is_trivially_relocatable<_ValueType>::value ||
-      !__allocator_has_trivial_move_construct<_Alloc, _ValueType>::value ||
-      !__allocator_has_trivial_destroy<_Alloc, _ValueType>::value) {
-    auto __destruct_first = __result;
-    auto __guard = std::__make_exception_guard(
-        _AllocatorDestroyRangeReverse<_Alloc, _ContiguousIterator>(__alloc, __destruct_first, __result));
-    auto __iter = __first;
-    while (__iter != __last) {
-#if _LIBCPP_HAS_EXCEPTIONS
-      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move_if_noexcept(*__iter));
-#else
-      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move(*__iter));
-#endif
-      ++__iter;
+  if (__libcpp_is_constant_evaluated() || !__is_trivially_allocator_relocatable<_Alloc, _ValueType>::value) {
+    while (__first != __last) {
+      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move(*__first));
+      allocator_traits<_Alloc>::destroy(__alloc, std::__to_address(__first));
+      ++__first;
       ++__result;
     }
-    __guard.__complete();
-    std::__allocator_destroy(__alloc, __first, __last);
   } else {
     // Casting to void* to suppress clang complaining that this is technically UB.
-    __builtin_memcpy(static_cast<void*>(std::__to_address(__result)),
-                     std::__to_address(__first),
-                     sizeof(_ValueType) * (__last - __first));
+    __builtin_memmove(static_cast<void*>(std::__to_address(__result)),
+                      std::__to_address(__first),
+                      sizeof(_ValueType) * (__last - __first));
   }
 }
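
The non-trivial branch of the new loop can be illustrated outside libc++ with plain placement new. This is a minimal sketch under the same precondition as the comment above (the destination starts before the source range); `relocate_forward` and `relocate_demo` are hypothetical names, and no allocator hooks are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Relocate [first, last) to result one element at a time: move-construct the
// destination, then destroy the source. Walking forward is safe for overlapping
// ranges when result precedes first, because every destination slot is already
// empty (either the initial gap or a source slot destroyed in an earlier step).
template <class T>
void relocate_forward(T* first, T* last, T* result) {
  for (; first != last; ++first, ++result) {
    ::new (static_cast<void*>(result)) T(std::move(*first));
    std::destroy_at(first);
  }
}

// Builds five strings in raw storage, destroys slot 1 to open a gap, then slides
// the tail down by one slot. Returns true if the four survivors are as expected.
inline bool relocate_demo() {
  std::allocator<std::string> alloc;
  std::string* buf = alloc.allocate(5);
  const char* init[] = {"a", "b", "c", "d", "e"};
  for (std::size_t i = 0; i < 5; ++i)
    ::new (static_cast<void*>(buf + i)) std::string(init[i]);

  std::destroy_at(buf + 1);                    // open a one-element gap at index 1
  relocate_forward(buf + 2, buf + 5, buf + 1); // destination [1, 4) overlaps source [2, 5)

  bool ok = buf[0] == "a" && buf[1] == "c" && buf[2] == "d" && buf[3] == "e";
  std::destroy(buf, buf + 4); // slot 4 was vacated by the relocation
  alloc.deallocate(buf, 5);
  return ok;
}
```

Because each source object is destroyed as soon as it has been moved, a throwing move constructor would leave the range half relocated. That is the exception-safety cost the commit message flags for further investigation.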

libcxx/include/__type_traits/is_trivially_relocatable.h

Lines changed: 4 additions & 6 deletions
@@ -23,14 +23,12 @@ _LIBCPP_BEGIN_NAMESPACE_STD

 // A type is trivially relocatable if a move construct + destroy of the original object is equivalent to
 // `memcpy(dst, src, sizeof(T))`.
-
-#if __has_builtin(__is_trivially_relocatable)
-template <class _Tp, class = void>
-struct __libcpp_is_trivially_relocatable : integral_constant<bool, __is_trivially_relocatable(_Tp)> {};
-#else
+//
+// Note that we don't use Clang's __is_trivially_relocatable builtin because it doesn't honor the presence
+// of non-trivial special members like assignment operators, or even a copy constructor, making it possible
+// to incorrectly optimize operations that should call user-provided operations instead.
 template <class _Tp, class = void>
 struct __libcpp_is_trivially_relocatable : is_trivially_copyable<_Tp> {};
-#endif

 template <class _Tp>
 struct __libcpp_is_trivially_relocatable<_Tp,
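
The reason for falling back to `is_trivially_copyable` can be seen with a type that has a user-provided assignment operator. The example below is a small illustration using only standard traits; `Plain`, `WithAssign` and `is_trivially_relocatable_sketch` are hypothetical names, and the sketch covers only the primary template shown in this hunk, not the opt-in specialization that follows it.

```cpp
#include <type_traits>

// Every special member is implicit and trivial.
struct Plain {
  int x;
};

// Same layout, but copy-assignment is user-provided and does something a
// byte-wise copy would not reproduce.
struct WithAssign {
  int x;
  WithAssign& operator=(const WithAssign& other) {
    x = other.x + 1;
    return *this;
  }
};

// Sketch of the conservative default: treat a type as trivially relocatable
// only when it is trivially copyable.
template <class T>
struct is_trivially_relocatable_sketch : std::is_trivially_copyable<T> {};

static_assert(is_trivially_relocatable_sketch<Plain>::value, "");
// The user-provided operator= makes the type non-trivially-copyable, so erase
// keeps calling it instead of replacing it with memmove.
static_assert(!is_trivially_relocatable_sketch<WithAssign>::value, "");
```

Since `vector::erase` is specified in terms of move-assignment, a trait that ignored `WithAssign::operator=` would let the `memmove` path change observable behavior.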

libcxx/include/__vector/vector.h

Lines changed: 32 additions & 24 deletions
@@ -34,6 +34,7 @@
 #include <__memory/allocator.h>
 #include <__memory/allocator_traits.h>
 #include <__memory/compressed_pair.h>
+#include <__memory/is_trivially_allocator_relocatable.h>
 #include <__memory/noexcept_move_assign_container.h>
 #include <__memory/pointer_traits.h>
 #include <__memory/swap_allocator.h>
@@ -515,8 +516,37 @@ class _LIBCPP_TEMPLATE_VIS vector {
   }
 #endif

-  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __position);
-  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __first, const_iterator __last);
+  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __position) {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __position != end(), "vector::erase(iterator) called with a non-dereferenceable iterator");
+    return erase(__position, __position + 1);
+  }
+  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __cfirst, const_iterator __clast) {
+    _LIBCPP_ASSERT_VALID_INPUT_RANGE(__cfirst <= __clast, "vector::erase(first, last) called with invalid range");
+
+    iterator __first = begin() + std::distance(cbegin(), __cfirst);
+    iterator __last = begin() + std::distance(cbegin(), __clast);
+    if (__first == __last)
+      return __last;
+
+    auto __n = std::distance(__first, __last);
+
+    // When the value_type is trivially relocatable, we know that move-assignment followed by a destruction
+    // is equivalent to a memcpy, and we can elide calls to the move-assignment operator (which are mandated
+    // by the Standard) under the as-if rule. So instead, we destroy the range being erased and we relocate the
+    // tail of the vector into the created gap.
+    if (__is_trivially_allocator_relocatable<_Allocator, value_type>::value) {
+      std::__allocator_destroy(this->__alloc_, __first, __last);
+      std::__uninitialized_allocator_relocate(this->__alloc_, __last, end(), __first);
+    } else {
+      auto __new_end = std::move(__last, end(), __first);
+      std::__allocator_destroy(this->__alloc_, __new_end, end());
+    }
+
+    this->__end_ -= __n;
+    __annotate_shrink(size() + __n);
+    return __first;
+  }

   _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void clear() _NOEXCEPT {
     size_type __old_size = size();
@@ -1125,28 +1155,6 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 inline
 #endif
 }

-template <class _Tp, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 inline _LIBCPP_HIDE_FROM_ABI typename vector<_Tp, _Allocator>::iterator
-vector<_Tp, _Allocator>::erase(const_iterator __position) {
-  _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-      __position != end(), "vector::erase(iterator) called with a non-dereferenceable iterator");
-  difference_type __ps = __position - cbegin();
-  pointer __p = this->__begin_ + __ps;
-  this->__destruct_at_end(std::move(__p + 1, this->__end_, __p));
-  return __make_iter(__p);
-}
-
-template <class _Tp, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 typename vector<_Tp, _Allocator>::iterator
-vector<_Tp, _Allocator>::erase(const_iterator __first, const_iterator __last) {
-  _LIBCPP_ASSERT_VALID_INPUT_RANGE(__first <= __last, "vector::erase(first, last) called with invalid range");
-  pointer __p = this->__begin_ + (__first - begin());
-  if (__first != __last) {
-    this->__destruct_at_end(std::move(__p + (__last - __first), this->__end_, __p));
-  }
-  return __make_iter(__p);
-}
-
 template <class _Tp, class _Allocator>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 void
 vector<_Tp, _Allocator>::__move_range(pointer __from_s, pointer __from_e, pointer __to) {
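
The two branches of the new `erase` can be compared on a raw buffer with a type that counts its special member calls. This is an illustrative sketch only: `Counters`, `Tracked`, `erase_by_move_assign`, `erase_by_relocate` and `run_erase` are hypothetical names, and the relocate variant always runs the element-wise loop, whereas libc++ takes that branch only when it collapses into a `memmove`.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <new>
#include <utility>

struct Counters {
  int move_assign = 0;
  int move_construct = 0;
  int destroy = 0;
};

// Element type that records which special members are invoked.
struct Tracked {
  int value;
  Counters* counters;
  Tracked(int v, Counters* c) : value(v), counters(c) {}
  Tracked(Tracked&& o) noexcept : value(o.value), counters(o.counters) { ++counters->move_construct; }
  Tracked& operator=(Tracked&& o) noexcept {
    value = o.value;
    ++counters->move_assign;
    return *this;
  }
  ~Tracked() { ++counters->destroy; }
};

// Fallback branch: move-assign the tail over the erased range, destroy the leftovers.
template <class T>
T* erase_by_move_assign(T* first, T* last, T* end) {
  T* new_end = std::move(last, end, first); // the <algorithm> overload of std::move
  std::destroy(new_end, end);
  return new_end;
}

// Relocation branch: destroy the erased range, then relocate the tail into the gap.
template <class T>
T* erase_by_relocate(T* first, T* last, T* end) {
  std::destroy(first, last);
  for (T* src = last; src != end; ++src, ++first) {
    ::new (static_cast<void*>(first)) T(std::move(*src));
    std::destroy_at(src);
  }
  return first;
}

struct EraseResult {
  bool contents_ok;
  Counters counters; // snapshot taken right after the erase, before cleanup
};

// Builds {0,1,2,3,4,5} in raw storage, erases [1, 3) with the given strategy,
// and reports the resulting contents and call counts.
template <class EraseFn>
EraseResult run_erase(EraseFn erase) {
  Counters counters;
  std::allocator<Tracked> alloc;
  Tracked* buf = alloc.allocate(6);
  for (int i = 0; i < 6; ++i)
    ::new (static_cast<void*>(buf + i)) Tracked(i, &counters);

  Tracked* new_end = erase(buf + 1, buf + 3, buf + 6);
  Counters after_erase = counters;
  bool ok = (new_end - buf == 4) && buf[0].value == 0 && buf[1].value == 3 &&
            buf[2].value == 4 && buf[3].value == 5;

  std::destroy(buf, new_end);
  alloc.deallocate(buf, 6);
  return {ok, after_erase};
}
```

Calling `run_erase(erase_by_move_assign<Tracked>)` and `run_erase(erase_by_relocate<Tracked>)` leaves the same four elements behind. The first performs three move-assignments and two destructor calls; the second performs two destructor calls plus three move-construct and destroy pairs, and those pairs are exactly what a trivially relocatable type lets the library replace with a single `memmove`.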

libcxx/include/module.modulemap

Lines changed: 1 addition & 0 deletions
@@ -1533,6 +1533,7 @@ module std [system] {
     module destruct_n { header "__memory/destruct_n.h" }
     module fwd { header "__fwd/memory.h" }
     module inout_ptr { header "__memory/inout_ptr.h" }
+    module is_trivially_allocator_relocatable { header "__memory/is_trivially_allocator_relocatable.h" }
     module noexcept_move_assign_container { header "__memory/noexcept_move_assign_container.h" }
     module out_ptr { header "__memory/out_ptr.h" }
     module pointer_traits { header "__memory/pointer_traits.h" }
