Skip to content

Commit 99209b2

Browse files
committed
[libc++] [sort] Improve performance of std::sort
Let in a wider variety of element types that are eligible to be processed by branchless sort functions, like POD structs and pointers. This change causes `sort_stability.pass.cpp` to prefer branchless sort for `EqualType`, but not if the comparator is an arbitrary lambda. So the test is adjusted to always pass a simple comparator. `nth_element_stability.pass.cpp` and `partial_sort_stability.pass.cpp` already use a simple comparator all the time, so they don't need to be adjusted. We considered forbidding the branchless version for unintentionally expensive comparators, e.g. double a[10]; std::sort(a, a+10, std::less<BigDouble>()); Derived *a[10]; std::sort(a, a+10, std::less<VirtualBase*>()); In both cases, it is safe, but more expensive than the original PR intended (since the comparator will be invoked more often in the branchless version). However, it's invoked only 0.1% more often, and benchmarks show that the branchless version is still (much) faster despite the extra invocations. So we leave this alone. Sample benchmarks: - https://godbolt.org/z/x1Waz3vbY - https://godbolt.org/z/5Ph3hfqbc Sanity-check that "branchless sort" is used only for trivially copyable types It relies on copying the elements, so copying must be at least possible, and ideally no more expensive than move; i.e., trivial. No functional change.
1 parent f7f7574 commit 99209b2

File tree

2 files changed

+7
-4
lines changed

2 files changed

+7
-4
lines changed

libcxx/include/__algorithm/sort.h

Lines changed: 6 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -28,8 +28,8 @@
2828
#include <__iterator/iterator_traits.h>
2929
#include <__type_traits/conditional.h>
3030
#include <__type_traits/disjunction.h>
31-
#include <__type_traits/is_arithmetic.h>
3231
#include <__type_traits/is_constant_evaluated.h>
32+
#include <__type_traits/is_trivially_copyable.h>
3333
#include <__utility/move.h>
3434
#include <__utility/pair.h>
3535
#include <climits>
@@ -144,7 +144,7 @@ template <class _Compare, class _Iter, class _Tp = typename iterator_traits<_Ite
144144
using __use_branchless_sort =
145145
integral_constant<bool,
146146
__libcpp_is_contiguous_iterator<_Iter>::value && sizeof(_Tp) <= sizeof(void*) &&
147-
is_arithmetic<_Tp>::value && __is_simple_comparator<_Compare>::value>;
147+
is_trivially_copyable<_Tp>::value && __is_simple_comparator<_Compare>::value>;
148148

149149
namespace __detail {
150150

@@ -158,6 +158,8 @@ template <class _Compare, class _RandomAccessIterator>
158158
inline _LIBCPP_HIDE_FROM_ABI void __cond_swap(_RandomAccessIterator __x, _RandomAccessIterator __y, _Compare __c) {
159159
// Note: this function behaves correctly even with proxy iterators (because it relies on `value_type`).
160160
using value_type = typename iterator_traits<_RandomAccessIterator>::value_type;
161+
static_assert(is_trivially_copyable<value_type>::value,
162+
"Copying must have no side effects distinguishable from swapping/moving");
161163
bool __r = __c(*__x, *__y);
162164
value_type __tmp = __r ? *__x : *__y;
163165
*__y = __r ? *__y : *__x;
@@ -171,6 +173,8 @@ inline _LIBCPP_HIDE_FROM_ABI void
171173
__partially_sorted_swap(_RandomAccessIterator __x, _RandomAccessIterator __y, _RandomAccessIterator __z, _Compare __c) {
172174
// Note: this function behaves correctly even with proxy iterators (because it relies on `value_type`).
173175
using value_type = typename iterator_traits<_RandomAccessIterator>::value_type;
176+
static_assert(is_trivially_copyable<value_type>::value,
177+
"Copying must have no side effects distinguishable from swapping/moving");
174178
bool __r = __c(*__z, *__x);
175179
value_type __tmp = __r ? *__z : *__x;
176180
*__z = __r ? *__x : *__z;

libcxx/test/libcxx/algorithms/sort_stability.pass.cpp

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -68,8 +68,7 @@ void test_same() {
6868
auto snapshot_custom_v = v;
6969
std::sort(v.begin(), v.end());
7070
std::sort(snapshot_v.begin(), snapshot_v.end());
71-
std::sort(snapshot_custom_v.begin(), snapshot_custom_v.end(),
72-
[](const EqualType&, const EqualType&) { return false; });
71+
std::sort(snapshot_custom_v.begin(), snapshot_custom_v.end(), std::less<EqualType>());
7372
bool all_equal = true;
7473
for (int i = 0; i < kSize; ++i) {
7574
if (v[i].value != snapshot_v[i].value || v[i].value != snapshot_custom_v[i].value) {

0 commit comments

Comments
 (0)