
[libc++] Optimizing is_permutation #129565


Open
wants to merge 6 commits into
base: main


imdj
Contributor

@imdj imdj commented Mar 3, 2025

Optimize is_permutation by using std::find_if, std::count_if, and std::mismatch to replace the hand-written loops.
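As a small illustration of one of these replacements, the sketch below compares a hand-written prefix-skipping loop with the equivalent `std::mismatch` call. The helper names (`skip_common_prefix_loop`, `skip_common_prefix_mismatch`, `check_prefix_helpers`) are hypothetical and are not part of libc++; the code only assumes the standard three-iterator overload of `std::mismatch`.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hand-written loop: advance both iterators while the elements match.
template <class It1, class It2, class Pred>
std::pair<It1, It2> skip_common_prefix_loop(It1 first1, It1 last1, It2 first2, Pred pred) {
  for (; first1 != last1; ++first1, (void)++first2) {
    if (!pred(*first1, *first2))
      break;
  }
  return {first1, first2};
}

// The same step expressed with std::mismatch. The three-iterator overload
// assumes the second range is at least as long as the first.
template <class It1, class It2, class Pred>
std::pair<It1, It2> skip_common_prefix_mismatch(It1 first1, It1 last1, It2 first2, Pred pred) {
  return std::mismatch(first1, last1, first2, pred);
}

// Hypothetical self-check: both helpers stop at the same position.
inline void check_prefix_helpers() {
  const std::vector<int> a{1, 2, 3, 9, 5};
  const std::vector<int> b{1, 2, 3, 5, 9};
  auto eq = [](int x, int y) { return x == y; };
  auto r1 = skip_common_prefix_loop(a.begin(), a.end(), b.begin(), eq);
  auto r2 = skip_common_prefix_mismatch(a.begin(), a.end(), b.begin(), eq);
  assert(r1 == r2);
  assert(r1.first - a.begin() == 3); // the first three elements are equal
}
```

Calling `check_prefix_helpers()` from `main` exercises both versions on the same input.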

Solve:

@imdj imdj requested a review from a team as a code owner March 3, 2025 18:13

github-actions bot commented Mar 3, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the libc++ label (libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.) Mar 3, 2025
@llvmbot
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-libcxx

Author: Imad Aldij (imdj)

Changes

Optimize is_permutation by using std::find_if, std::count_if, and std::mismatch when possible.

Solve:


Full diff: https://github.com/llvm/llvm-project/pull/129565.diff

1 Files Affected:

  • (modified) libcxx/include/__algorithm/is_permutation.h (+23-20)
diff --git a/libcxx/include/__algorithm/is_permutation.h b/libcxx/include/__algorithm/is_permutation.h
index 1afb11596bc6b..c6cc947c75714 100644
--- a/libcxx/include/__algorithm/is_permutation.h
+++ b/libcxx/include/__algorithm/is_permutation.h
@@ -11,7 +11,10 @@
 #define _LIBCPP___ALGORITHM_IS_PERMUTATION_H
 
 #include <__algorithm/comp.h>
+#include <__algorithm/count_if.h>
+#include <__algorithm/find_if.h>
 #include <__algorithm/iterator_operations.h>
+#include <__algorithm/mismatch.h>
 #include <__config>
 #include <__functional/identity.h>
 #include <__iterator/concepts.h>
@@ -82,28 +85,29 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 bool __is_permutation_impl(
 
   for (auto __i = __first1; __i != __last1; ++__i) {
     //  Have we already counted the number of *__i in [f1, l1)?
-    auto __match = __first1;
-    for (; __match != __i; ++__match) {
-      if (std::__invoke(__pred, std::__invoke(__proj1, *__match), std::__invoke(__proj1, *__i)))
-        break;
-    }
+    auto __match = std::find_if(__first1, __i, [&](const auto& __x) {
+      return bool(std::__invoke(__pred, std::__invoke(__proj1, __x), std::__invoke(__proj1, *__i)));
+    });
 
     if (__match == __i) {
+      auto __proj = std::__identity();
+
       // Count number of *__i in [f2, l2)
-      _D1 __c2 = 0;
-      for (auto __j = __first2; __j != __last2; ++__j) {
-        if (std::__invoke(__pred, std::__invoke(__proj1, *__i), std::__invoke(__proj2, *__j)))
-          ++__c2;
-      }
+      auto __predicate2 = [&](const auto& __x) {
+        return bool(std::__invoke(__pred, std::__invoke(__proj1, *__i), std::__invoke(__proj2, __x)));
+      };
+      _D1 __c2 = std::__count_if<_AlgPolicy>(__first2, __last2, __predicate2, __proj);
+
       if (__c2 == 0)
         return false;
 
-      // Count number of *__i in [__i, l1) (we can start with 1)
-      _D1 __c1 = 1;
-      for (auto __j = _IterOps<_AlgPolicy>::next(__i); __j != __last1; ++__j) {
-        if (std::__invoke(__pred, std::__invoke(__proj1, *__i), std::__invoke(__proj1, *__j)))
-          ++__c1;
-      }
+      // Count number of *__i in [__i, l1)
+      auto __predicate1 = [&](const auto& __x) {
+        return bool(std::__invoke(__pred, std::__invoke(__proj1, *__i), std::__invoke(__proj1, __x)));
+      };
+      _D1 __c1 = std::__count_if<_AlgPolicy>(_IterOps<_AlgPolicy>::next(__i), __last1, __predicate1, __proj);
+      ++__c1; // Add 1 for *__i itself
+
       if (__c1 != __c2)
         return false;
     }
@@ -117,10 +121,9 @@ template <class _AlgPolicy, class _ForwardIterator1, class _Sentinel1, class _Fo
 [[__nodiscard__]] _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20 bool __is_permutation(
     _ForwardIterator1 __first1, _Sentinel1 __last1, _ForwardIterator2 __first2, _BinaryPredicate&& __pred) {
   // Shorten sequences as much as possible by lopping of any equal prefix.
-  for (; __first1 != __last1; ++__first1, (void)++__first2) {
-    if (!__pred(*__first1, *__first2))
-      break;
-  }
+  auto __result = std::mismatch(__first1, __last1, __first2, __pred);
+  __first1      = __result.first;
+  __first2      = __result.second;
 
   if (__first1 == __last1)
     return true;
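
For readers who want to try the idea outside libc++, here is a minimal standalone sketch of the same structure: trim the common prefix with `std::mismatch`, skip already-counted values with `std::find_if`, and count occurrences with `std::count_if`. It is a simplification under stated assumptions, not the patch itself: the name `is_permutation_sketch` is hypothetical, projections and sentinels are omitted, and both ranges are assumed to have the same length.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical name; simplified sketch without projections or sentinels.
// Precondition (assumed): the second range is as long as the first.
template <class It1, class It2, class Pred>
bool is_permutation_sketch(It1 first1, It1 last1, It2 first2, Pred pred) {
  // Drop the common prefix of both ranges.
  auto trimmed = std::mismatch(first1, last1, first2, pred);
  first1       = trimmed.first;
  first2       = trimmed.second;
  if (first1 == last1)
    return true;

  It2 last2 = std::next(first2, std::distance(first1, last1));

  for (It1 i = first1; i != last1; ++i) {
    // Skip values that were already counted earlier in [first1, i).
    auto seen = std::find_if(first1, i, [&](const auto& x) { return bool(pred(x, *i)); });
    if (seen != i)
      continue;

    auto same_as_i = [&](const auto& x) { return bool(pred(*i, x)); };

    // Occurrences of *i in the second range.
    auto c2 = std::count_if(first2, last2, same_as_i);
    if (c2 == 0)
      return false;

    // Occurrences of *i in (i, last1), plus 1 for *i itself.
    auto c1 = std::count_if(std::next(i), last1, same_as_i) + 1;
    if (c1 != c2)
      return false;
  }
  return true;
}

// Hypothetical self-check against std::is_permutation.
inline void check_is_permutation_sketch() {
  const std::vector<int> a{1, 2, 3, 4};
  const std::vector<int> b{1, 4, 3, 2};
  auto eq = [](int x, int y) { return x == y; };
  assert(is_permutation_sketch(a.begin(), a.end(), b.begin(), eq) ==
         std::is_permutation(a.begin(), a.end(), b.begin(), eq));
}
```

Like the loops it replaces, this keeps the quadratic worst case; the change is about expressing each inner loop through an existing algorithm, not about changing the complexity.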

@imdj imdj marked this pull request as draft March 3, 2025 18:38
@imdj imdj closed this Mar 4, 2025
@imdj imdj reopened this Mar 4, 2025
@imdj imdj force-pushed the main branch 3 times, most recently from ba40d2e to 471c691 Compare March 4, 2025 15:34
@imdj imdj closed this Mar 4, 2025
@imdj imdj reopened this Mar 6, 2025
@mordante mordante self-assigned this Mar 6, 2025
@imdj imdj force-pushed the main branch 3 times, most recently from d062eba to cc9c0b2 Compare March 7, 2025 04:41
@imdj imdj marked this pull request as ready for review March 7, 2025 08:25
@imdj
Contributor Author

imdj commented Mar 8, 2025

Based on initial benchmark results, this implementation doesn't (yet) offer better performance. I will try to profile it and tweak a few things.

@imdj
Contributor Author

imdj commented Mar 11, 2025

Any feedback, suggestions, ideas to further boost the performance and improve the PR are welcome.

@mordante
Member

> Any feedback, suggestions, ideas to further boost the performance and improve the PR are welcome.

Can you post the benchmarks for before and after your change?

@imdj
Contributor Author

imdj commented Mar 11, 2025

> Can you post the benchmarks for before and after your change?

The major difference I noticed is in the bm_ranges_is_permutation_diff_last test.
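
A note on reading the comparison that follows: the Time and CPU columns appear to be relative changes, computed as (new - old) / old, so a negative value means the new build is faster. This is an assumption inferred from the rows themselves (for example, 14698 to 14188 is reported as -0.0347), and the helper name `relative_change` is hypothetical. Because the Old/New columns are rounded, recomputed values may differ slightly in the last digit.

```cpp
#include <cassert>
#include <cmath>

// Assumed formula for the "Time" and "CPU" columns: (new - old) / old.
inline double relative_change(double old_value, double new_value) {
  return (new_value - old_value) / old_value;
}

// Hypothetical self-check using the bm_std_is_permutation_shuffled/128 row.
inline void check_relative_change() {
  const double r = relative_change(14698.0, 14188.0);
  assert(std::abs(r - (-0.0347)) < 1e-4);
}
```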

Result comparison:
Benchmark                                                   Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------
bm_std_is_permutation_same/1                             -0.0004         -0.0002             3             3             3             3
bm_std_is_permutation_same/2                             -0.0033         -0.0033             5             5             5             5
bm_std_is_permutation_same/3                             +0.0004         +0.0002             6             6             6             6
bm_std_is_permutation_same/4                             -0.0006         -0.0004             6             6             6             6
bm_std_is_permutation_same/5                             +0.1893         +0.1892             6             7             6             7
bm_std_is_permutation_same/6                             +0.0003         +0.0001             7             7             7             7
bm_std_is_permutation_same/7                             +0.0022         +0.0024             7             7             7             7
bm_std_is_permutation_same/8                             +0.1331         +0.1329             7             8             7             8
bm_std_is_permutation_same/15                            +0.0065         +0.0069             9             9             9             9
bm_std_is_permutation_same/16                            +0.0084         +0.0084            10            10            10            10
bm_std_is_permutation_same/17                            -0.0007         -0.0009            10            10            10            10
bm_std_is_permutation_same/31                            -0.0008         -0.0009            17            16            17            16
bm_std_is_permutation_same/32                            +0.0021         +0.0017            31            31            31            31
bm_std_is_permutation_same/33                            +0.0062         +0.0062            18            18            18            18
bm_std_is_permutation_same/63                            -0.0021         -0.0023            33            33            33            33
bm_std_is_permutation_same/64                            -0.0020         -0.0020            33            33            33            33
bm_std_is_permutation_same/65                            -0.0017         -0.0019            34            34            34            34
bm_std_is_permutation_same/127                           -0.0178         -0.0178            81            79            81            79
bm_std_is_permutation_same/128                           -0.0121         -0.0123            80            79            80            79
bm_std_is_permutation_same/129                           +0.0024         +0.0026            80            80            80            80
bm_std_is_permutation_same/255                           -0.0004         -0.0004           144           144           144           144
bm_std_is_permutation_same/256                           -0.0016         -0.0016           144           144           144           144
bm_std_is_permutation_same/257                           -0.0063         -0.0063           144           143           144           143
bm_std_is_permutation_same/511                           +0.0028         +0.0026           272           273           272           273
bm_std_is_permutation_same/512                           -0.0014         -0.0014           272           272           272           272
bm_std_is_permutation_same/513                           +0.0013         +0.0011           271           271           271           271
bm_std_is_permutation_same/1023                          -0.0004         -0.0002           528           528           528           528
bm_std_is_permutation_same/1024                          +0.0010         +0.0010           529           530           529           530
bm_std_is_permutation_same/1025                          -0.0012         -0.0010           530           529           530           529
bm_std_is_permutation_same/2047                          +0.0000         +0.0000          1044          1044          1044          1044
bm_std_is_permutation_same/2048                          +0.0006         +0.0004          1044          1045          1044          1045
bm_std_is_permutation_same/2049                          +0.0000         +0.0002          1044          1044          1044          1044
bm_std_is_permutation_same/4095                          -0.0008         -0.0010          2092          2091          2092          2090
bm_std_is_permutation_same/4096                          -0.0004         -0.0002          2090          2089          2090          2089
bm_std_is_permutation_same/4097                          -0.0004         -0.0006          2088          2087          2088          2087
bm_std_is_permutation_same/8191                          -0.0006         -0.0004          4152          4149          4151          4149
bm_std_is_permutation_same/8192                          -0.0004         -0.0006          4148          4146          4147          4145
bm_std_is_permutation_same/8193                          -0.0002         -0.0002          4146          4145          4146          4145
bm_std_is_permutation_same/16383                         +0.0062         +0.0060          8309          8361          8309          8359
bm_std_is_permutation_same/16384                         +0.0063         +0.0065          8309          8361          8307          8361
bm_std_is_permutation_same/16385                         +0.0110         +0.0110          8308          8399          8308          8399
bm_std_is_permutation_same/32767                         -0.0078         -0.0075         17159         17026         17155         17026
bm_std_is_permutation_same/32768                         -0.0076         -0.0076         17155         17025         17154         17025
bm_std_is_permutation_same/32769                         -0.0077         -0.0079         17159         17027         17159         17023
bm_std_is_permutation_same/65535                         -0.0165         -0.0166         35562         34974         35562         34972
bm_std_is_permutation_same/65536                         -0.0168         -0.0167         35573         34977         35564         34969
bm_std_is_permutation_same/65537                         -0.0168         -0.0168         35569         34971         35567         34970
bm_std_is_permutation_same/131071                        -0.0103         -0.0105         71298         70564         71298         70548
bm_std_is_permutation_same/131072                        -0.0130         -0.0130         71446         70520         71444         70517
bm_std_is_permutation_same/131073                        -0.0140         -0.0141         71349         70347         71347         70342
bm_std_is_permutation_same/262143                        -0.0035         -0.0033        141974        141475        141939        141474
bm_std_is_permutation_same/262144                        -0.0016         -0.0016        141531        141303        141528        141299
bm_std_is_permutation_same/262145                        -0.0037         -0.0038        141938        141407        141908        141375
bm_std_is_permutation_same/524287                        -0.0027         -0.0027        285604        284838        285603        284827
bm_std_is_permutation_same/524288                        -0.0124         -0.0122        286537        282988        286423        282918
bm_std_is_permutation_same/524289                        -0.0104         -0.0102        285392        282435        285336        282424
bm_ranges_is_permutation_same/1                          -0.0007         -0.0003             5             5             5             5
bm_ranges_is_permutation_same/2                          -0.0005         -0.0004             6             6             6             6
bm_ranges_is_permutation_same/3                          +0.0001         -0.0000             7             7             7             7
bm_ranges_is_permutation_same/4                          -0.0004         -0.0004             9             9             9             9
bm_ranges_is_permutation_same/5                          -0.0000         -0.0002            10            10            10            10
bm_ranges_is_permutation_same/6                          -0.0005         -0.0003            11            11            11            11
bm_ranges_is_permutation_same/7                          +0.0130         +0.0129            12            12            12            12
bm_ranges_is_permutation_same/8                          +0.0015         +0.0015            13            13            13            13
bm_ranges_is_permutation_same/15                         +0.0002         +0.0002            22            22            22            22
bm_ranges_is_permutation_same/16                         +0.0002         +0.0002            23            23            23            23
bm_ranges_is_permutation_same/17                         +0.0000         +0.0000            24            24            24            24
bm_ranges_is_permutation_same/31                         -0.0053         -0.0055            51            51            51            51
bm_ranges_is_permutation_same/32                         -0.0000         -0.0000            42            42            42            42
bm_ranges_is_permutation_same/33                         +0.0002         +0.0002            43            43            43            43
bm_ranges_is_permutation_same/63                         -0.0000         -0.0000            80            80            80            80
bm_ranges_is_permutation_same/64                         -0.0005         -0.0007            81            81            81            81
bm_ranges_is_permutation_same/65                         -0.0002         -0.0000            82            82            82            82
bm_ranges_is_permutation_same/127                        +0.0015         +0.0015           172           172           172           172
bm_ranges_is_permutation_same/128                        +0.0070         +0.0072           173           174           173           174
bm_ranges_is_permutation_same/129                        +0.0010         +0.0010           174           174           174           174
bm_ranges_is_permutation_same/255                        -0.0001         +0.0001           326           326           326           326
bm_ranges_is_permutation_same/256                        -0.0001         -0.0001           327           327           327           327
bm_ranges_is_permutation_same/257                        +0.0008         +0.0010           328           329           328           329
bm_ranges_is_permutation_same/511                        +0.0004         +0.0004           635           635           635           635
bm_ranges_is_permutation_same/512                        +0.0001         -0.0002           636           636           636           636
bm_ranges_is_permutation_same/513                        -0.0002         -0.0000           637           637           637           637
bm_ranges_is_permutation_same/1023                       +0.0001         +0.0001          1252          1252          1252          1252
bm_ranges_is_permutation_same/1024                       -0.0010         -0.0008          1255          1253          1254          1253
bm_ranges_is_permutation_same/1025                       -0.0001         -0.0000          1255          1255          1255          1254
bm_ranges_is_permutation_same/2047                       +0.0011         +0.0011          2488          2491          2488          2490
bm_ranges_is_permutation_same/2048                       -0.0004         -0.0004          2489          2488          2489          2488
bm_ranges_is_permutation_same/2049                       -0.0001         -0.0001          2490          2490          2489          2489
bm_ranges_is_permutation_same/4095                       -0.0000         -0.0000          4954          4954          4954          4953
bm_ranges_is_permutation_same/4096                       -0.0003         -0.0001          4956          4954          4955          4954
bm_ranges_is_permutation_same/4097                       +0.0000         -0.0002          4956          4956          4956          4955
bm_ranges_is_permutation_same/8191                       +0.0002         +0.0002          9891          9893          9891          9893
bm_ranges_is_permutation_same/8192                       -0.0001         -0.0001          9895          9894          9894          9894
bm_ranges_is_permutation_same/8193                       -0.0002         -0.0000          9898          9896          9896          9896
bm_ranges_is_permutation_same/16383                      -0.0001         -0.0001         19776         19773         19775         19772
bm_ranges_is_permutation_same/16384                      +0.0080         +0.0080         19779         19937         19775         19932
bm_ranges_is_permutation_same/16385                      -0.0001         -0.0001         19776         19774         19776         19774
bm_ranges_is_permutation_same/32767                      -0.0003         -0.0003         39575         39562         39567         39554
bm_ranges_is_permutation_same/32768                      -0.0036         -0.0036         39700         39555         39699         39554
bm_ranges_is_permutation_same/32769                      -0.0003         +0.0001         39581         39568         39564         39567
bm_ranges_is_permutation_same/65535                      -0.0002         -0.0000         79136         79120         79120         79118
bm_ranges_is_permutation_same/65536                      +0.0002         +0.0004         79132         79151         79101         79133
bm_ranges_is_permutation_same/65537                      +0.0003         +0.0003         79114         79137         79112         79135
bm_ranges_is_permutation_same/131071                     +0.0006         +0.0002        158261        158353        158258        158291
bm_ranges_is_permutation_same/131072                     -0.0003         -0.0002        158283        158239        158247        158209
bm_ranges_is_permutation_same/131073                     -0.0004         -0.0004        158280        158217        158275        158212
bm_ranges_is_permutation_same/262143                     -0.0013         -0.0011        316781        316367        316708        316359
bm_ranges_is_permutation_same/262144                     -0.0002         -0.0004        316501        316443        316493        316372
bm_ranges_is_permutation_same/262145                     -0.0001         +0.0001        316473        316432        316401        316429
bm_ranges_is_permutation_same/524287                     -0.0004         -0.0004        633080        632854        633071        632838
bm_ranges_is_permutation_same/524288                     -0.0004         -0.0003        633503        633271        633349        633129
bm_ranges_is_permutation_same/524289                     -0.0003         -0.0003        633223        633053        633209        633044
bm_std_is_permutation_shuffled/1                         +0.0002         +0.0000             3             3             3             3
bm_std_is_permutation_shuffled/2                         -0.0131         -0.0131             4             4             4             4
bm_std_is_permutation_shuffled/3                         -0.0194         -0.0194            26            26            26            26
bm_std_is_permutation_shuffled/4                         +0.0350         +0.0350            35            36            35            36
bm_std_is_permutation_shuffled/5                         +0.0179         +0.0181            46            47            46            47
bm_std_is_permutation_shuffled/6                         +0.0700         +0.0700            64            68            64            68
bm_std_is_permutation_shuffled/7                         +0.0542         +0.0540            79            83            79            83
bm_std_is_permutation_shuffled/8                         -0.0381         -0.0379            95            92            95            92
bm_std_is_permutation_shuffled/15                        +0.0331         +0.0328           238           245           238           245
bm_std_is_permutation_shuffled/16                        +0.0015         +0.0017           254           255           254           255
bm_std_is_permutation_shuffled/17                        +0.0145         +0.0143           279           283           279           283
bm_std_is_permutation_shuffled/31                        -0.0073         -0.0070           821           815           821           815
bm_std_is_permutation_shuffled/32                        +0.0171         +0.0171           813           827           813           827
bm_std_is_permutation_shuffled/33                        +0.0283         +0.0283           858           882           858           882
bm_std_is_permutation_shuffled/63                        -0.0037         -0.0037          3009          2998          3009          2998
bm_std_is_permutation_shuffled/64                        +0.0202         +0.0202          3088          3150          3087          3149
bm_std_is_permutation_shuffled/65                        +0.0052         +0.0052          3198          3214          3198          3214
bm_std_is_permutation_shuffled/127                       -0.0029         -0.0029         14102         14061         14099         14058
bm_std_is_permutation_shuffled/128                       -0.0347         -0.0347         14698         14188         14698         14188
bm_std_is_permutation_shuffled/129                       +0.0091         +0.0089         14401         14532         14401         14529
bm_std_is_permutation_shuffled/255                       -0.0001         +0.0001         50891         50887         50880         50886
bm_std_is_permutation_shuffled/256                       +0.0006         +0.0006         52317         52350         52316         52349
bm_std_is_permutation_shuffled/257                       +0.0062         +0.0062         53020         53350         53009         53337
bm_std_is_permutation_shuffled/511                       +0.0006         +0.0005        187560        187665        187561        187661
bm_std_is_permutation_shuffled/512                       +0.0014         +0.0012        188259        188521        188258        188479
bm_std_is_permutation_shuffled/513                       +0.0041         +0.0040        188671        189436        188668        189431
bm_std_is_permutation_shuffled/1023                      -0.0005         -0.0007        698857        698522        698843        698386
bm_std_is_permutation_shuffled/1024                      -0.0012         -0.0010        700251        699426        700102        699426
bm_std_is_permutation_shuffled/1025                      +0.0002         +0.0002        701337        701474        701337        701457
bm_std_is_permutation_shuffled/2047                      -0.0011         -0.0011       2691107       2688267       2690567       2687692
bm_std_is_permutation_shuffled/2048                      -0.0003         -0.0004       2691481       2690577       2691469       2690499
bm_std_is_permutation_shuffled/2049                      +0.0053         +0.0053       2694752       2709124       2694233       2708549
bm_std_is_permutation_shuffled/4095                      +0.0001         +0.0001      10589321      10590078      10589142      10589746
bm_std_is_permutation_shuffled/4096                      +0.0004         +0.0002      10585031      10589300      10584701      10586801
bm_std_is_permutation_shuffled/4097                      -0.0000         +0.0002      10588896      10588522      10586637      10588336
bm_std_is_permutation_shuffled/8191                      +0.0043         +0.0041      42031443      42211809      42030784      42203148
bm_std_is_permutation_shuffled/8192                      +0.0010         +0.0010      42016455      42059045      42015698      42058013
bm_std_is_permutation_shuffled/8193                      +0.0023         +0.0022      42035499      42130421      42027382      42121379
bm_std_is_permutation_shuffled/16383                     +0.0084         +0.0084     167862607     169269172     167859115     169265197
bm_std_is_permutation_shuffled/16384                     +0.0087         +0.0083     167936024     169390525     167932424     169329880
bm_std_is_permutation_shuffled/16385                     +0.0077         +0.0080     168015881     169314541     167976897     169312823
bm_std_is_permutation_shuffled/32767                     -0.0026         -0.0026     692557414     690734388     692543469     690710627
bm_std_is_permutation_shuffled/32768                     -0.0022         -0.0024     692810260     691317524     692795517     691157703
bm_std_is_permutation_shuffled/32769                     -0.0027         -0.0025     693037475     691141435     692840831     691141413
bm_std_is_permutation_shuffled/65535                     -0.0019         -0.0020    2873088169    2867487999    2872884625    2867144339
bm_std_is_permutation_shuffled/65536                     -0.0003         -0.0003    2869429171    2868493156    2869125188    2868388830
bm_std_is_permutation_shuffled/65537                     -0.0004         -0.0004    2869018543    2867792058    2868799630    2867595502
bm_std_is_permutation_shuffled/131071                    -0.0004         -0.0004   11474642163   11469849517   11473591754   11468810812
bm_std_is_permutation_shuffled/131072                    -0.0005         -0.0005   11476707423   11470887314   11475686253   11469847399
bm_std_is_permutation_shuffled/131073                    -0.0003         -0.0004   11475167455   11471225403   11474012322   11469949622
bm_std_is_permutation_shuffled/262143                    -0.0006         -0.0006   45881668187   45853001955   45877507209   45849075410
bm_std_is_permutation_shuffled/262144                    -0.0011         -0.0011   45896428165   45844911247   45892366261   45840610638
bm_std_is_permutation_shuffled/262145                    +0.0005         +0.0004   45826828632   45848448732   45822862746   45841995903
bm_std_is_permutation_shuffled/524287                    +0.0080         +0.0080  181994177077  183444203623  181977203110  183424253040
bm_std_is_permutation_shuffled/524288                    -0.0016         -0.0017  183548559246  183250724577  183532566493  183229130974
bm_std_is_permutation_shuffled/524289                    -0.0007         -0.0008  183555832342  183420968188  183539700812  183401332759
bm_ranges_is_permutation_shuffled/1                      -0.0001         -0.0001             5             5             5             5
bm_ranges_is_permutation_shuffled/2                      +0.0004         +0.0003             5             5             5             5
bm_ranges_is_permutation_shuffled/3                      -0.0019         -0.0017            26            26            26            26
bm_ranges_is_permutation_shuffled/4                      +0.0038         +0.0036            36            36            36            36
bm_ranges_is_permutation_shuffled/5                      -0.0234         -0.0232            49            48            49            48
bm_ranges_is_permutation_shuffled/6                      +0.0013         +0.0011            70            70            70            70
bm_ranges_is_permutation_shuffled/7                      +0.0326         +0.0326            84            87            84            87
bm_ranges_is_permutation_shuffled/8                      -0.0016         -0.0018            95            95            95            95
bm_ranges_is_permutation_shuffled/15                     -0.0044         -0.0045           252           251           252           251
bm_ranges_is_permutation_shuffled/16                     +0.0032         +0.0029           261           262           261           262
bm_ranges_is_permutation_shuffled/17                     +0.0113         +0.0112           288           291           288           291
bm_ranges_is_permutation_shuffled/31                     +0.0018         +0.0020           806           807           805           807
bm_ranges_is_permutation_shuffled/32                     -0.0048         -0.0048           825           821           825           821
bm_ranges_is_permutation_shuffled/33                     +0.0012         +0.0014           877           878           876           878
bm_ranges_is_permutation_shuffled/63                     +0.0661         +0.0659          2985          3182          2985          3181
bm_ranges_is_permutation_shuffled/64                     -0.0188         -0.0186          3085          3027          3084          3027
bm_ranges_is_permutation_shuffled/65                     +0.0014         +0.0012          3176          3180          3176          3179
bm_ranges_is_permutation_shuffled/127                    -0.0096         -0.0094         14054         13920         14051         13919
bm_ranges_is_permutation_shuffled/128                    -0.0110         -0.0112         14198         14041         14198         14038
bm_ranges_is_permutation_shuffled/129                    -0.0356         -0.0356         14819         14292         14819         14291
bm_ranges_is_permutation_shuffled/255                    -0.0004         -0.0004         51248         51226         51237         51216
bm_ranges_is_permutation_shuffled/256                    -0.0072         -0.0071         52927         52548         52925         52548
bm_ranges_is_permutation_shuffled/257                    -0.0095         -0.0093         53811         53299         53800         53298
bm_ranges_is_permutation_shuffled/511                    -0.0024         -0.0026        187997        187547        187997        187508
bm_ranges_is_permutation_shuffled/512                    -0.0038         -0.0038        188587        187868        188587        187862
bm_ranges_is_permutation_shuffled/513                    -0.0061         -0.0063        189553        188394        189550        188353
bm_ranges_is_permutation_shuffled/1023                   -0.0003         -0.0003        698451        698224        698421        698224
bm_ranges_is_permutation_shuffled/1024                   -0.0020         -0.0018        700691        699314        700540        699314
bm_ranges_is_permutation_shuffled/1025                   -0.0022         -0.0024        701910        700357        701906        700206
bm_ranges_is_permutation_shuffled/2047                   -0.0013         -0.0011       2690816       2687429       2690279       2687326
bm_ranges_is_permutation_shuffled/2048                   -0.0016         -0.0018       2693467       2689161       2693468       2688488
bm_ranges_is_permutation_shuffled/2049                   -0.0007         -0.0005       2694598       2692701       2694077       2692623
bm_ranges_is_permutation_shuffled/4095                   -0.0013         -0.0015      10594511      10580690      10594413      10578334
bm_ranges_is_permutation_shuffled/4096                   -0.0009         -0.0009      10587811      10578635      10587508      10578508
bm_ranges_is_permutation_shuffled/4097                   -0.0008         -0.0007      10590886      10582784      10588513      10580704
bm_ranges_is_permutation_shuffled/8191                   -0.0001         -0.0000      42030215      42027790      42029023      42027666
bm_ranges_is_permutation_shuffled/8192                   -0.0001         -0.0002      42048163      42045729      42047317      42036877
bm_ranges_is_permutation_shuffled/8193                   -0.0006         -0.0005      42072330      42045753      42063663      42044714
bm_ranges_is_permutation_shuffled/16383                  +0.0106         +0.0104     167216396     168991582     167211802     168951945
bm_ranges_is_permutation_shuffled/16384                  +0.0102         +0.0102     167282705     168985176     167278536     168983595
bm_ranges_is_permutation_shuffled/16385                  +0.0098         +0.0096     167326336     168958957     167321688     168922306
bm_ranges_is_permutation_shuffled/32767                  +0.0115         +0.0115     685566768     693445790     685565275     693429650
bm_ranges_is_permutation_shuffled/32768                  +0.0119         +0.0122     685836889     694024803     685662958     694000649
bm_ranges_is_permutation_shuffled/32769                  +0.0119         +0.0118     686015852     694202242     685995474     694063013
bm_ranges_is_permutation_shuffled/65535                  -0.0039         -0.0039    2866882853    2855634883    2866689681    2855440699
bm_ranges_is_permutation_shuffled/65536                  -0.0047         -0.0046    2868211379    2854859954    2867879517    2854662950
bm_ranges_is_permutation_shuffled/65537                  -0.0065         -0.0065    2873932848    2855274529    2873741664    2854933990
bm_ranges_is_permutation_shuffled/131071                 -0.0006         -0.0006   11473187791   11465760180   11472142532   11464709497
bm_ranges_is_permutation_shuffled/131072                 -0.0003         -0.0003   11468657936   11465564690   11467620315   11464356593
bm_ranges_is_permutation_shuffled/131073                 -0.0002         -0.0002   11469740830   11467570816   11468851763   11466357589
bm_ranges_is_permutation_shuffled/262143                 -0.0047         -0.0047   46070737348   45856328635   46066570207   45852302232
bm_ranges_is_permutation_shuffled/262144                 -0.0045         -0.0045   46056268272   45849827037   46048913777   45842214423
bm_ranges_is_permutation_shuffled/262145                 -0.0039         -0.0040   46041119561   45861110935   46034039004   45851152772
bm_ranges_is_permutation_shuffled/524287                 -0.0093         -0.0093  185258782710  183527790482  185232217117  183508227839
bm_ranges_is_permutation_shuffled/524288                 -0.0094         -0.0094  185266962888  183527491037  185242322875  183495078047
bm_ranges_is_permutation_shuffled/524289                 -0.0046         -0.0045  185294226235  184447066203  185264050192  184428782015
bm_std_is_permutation_diff_last/1                        -0.1251         -0.1251             3             3             3             3
bm_std_is_permutation_diff_last/2                        -0.1818         -0.1818             4             4             4             4
bm_std_is_permutation_diff_last/3                        -0.1098         -0.1098             5             4             5             4
bm_std_is_permutation_diff_last/4                        -0.0002         -0.0001             5             5             5             5
bm_std_is_permutation_diff_last/5                        +0.0174         +0.0173             5             5             5             5
bm_std_is_permutation_diff_last/6                        -0.0346         -0.0346             6             6             6             6
bm_std_is_permutation_diff_last/7                        -0.0074         -0.0074             7             6             7             6
bm_std_is_permutation_diff_last/8                        +0.0079         +0.0076             7             7             7             7
bm_std_is_permutation_diff_last/15                       -0.0075         -0.0073            10            10            10            10
bm_std_is_permutation_diff_last/16                       -0.0715         -0.0715            11            10            11            10
bm_std_is_permutation_diff_last/17                       -0.0359         -0.0359            11            11            11            11
bm_std_is_permutation_diff_last/31                       -0.0548         -0.0548            18            17            18            17
bm_std_is_permutation_diff_last/32                       -0.0136         -0.0138            18            17            18            17
bm_std_is_permutation_diff_last/33                       -0.0813         -0.0814            39            36            39            36
bm_std_is_permutation_diff_last/63                       -0.0123         -0.0122            33            33            33            33
bm_std_is_permutation_diff_last/64                       -0.0031         -0.0031            33            33            33            33
bm_std_is_permutation_diff_last/65                       -0.0032         -0.0031            34            34            34            34
bm_std_is_permutation_diff_last/127                      -0.0041         -0.0041            78            78            78            78
bm_std_is_permutation_diff_last/128                      -0.0070         -0.0071            79            78            79            78
bm_std_is_permutation_diff_last/129                      -0.0050         -0.0050            79            78            79            78
bm_std_is_permutation_diff_last/255                      -0.0031         -0.0031           142           142           142           142
bm_std_is_permutation_diff_last/256                      -0.0030         -0.0030           143           142           143           142
bm_std_is_permutation_diff_last/257                      -0.0005         -0.0006           143           143           143           143
bm_std_is_permutation_diff_last/511                      -0.0014         -0.0013           271           271           271           271
bm_std_is_permutation_diff_last/512                      -0.0015         -0.0014           271           271           271           271
bm_std_is_permutation_diff_last/513                      -0.0016         -0.0016           272           271           272           271
bm_std_is_permutation_diff_last/1023                     -0.0049         -0.0049           531           528           531           528
bm_std_is_permutation_diff_last/1024                     -0.0007         -0.0007           529           528           529           528
bm_std_is_permutation_diff_last/1025                     -0.0006         -0.0005           529           529           529           529
bm_std_is_permutation_diff_last/2047                     -0.0006         -0.0006          1043          1043          1043          1042
bm_std_is_permutation_diff_last/2048                     -0.0002         -0.0003          1044          1043          1043          1043
bm_std_is_permutation_diff_last/2049                     -0.0002         -0.0003          1044          1043          1044          1043
bm_std_is_permutation_diff_last/4095                     +0.0019         +0.0015          2089          2093          2088          2092
bm_std_is_permutation_diff_last/4096                     -0.0003         -0.0002          2096          2095          2096          2095
bm_std_is_permutation_diff_last/4097                     -0.0028         -0.0029          2101          2095          2101          2095
bm_std_is_permutation_diff_last/8191                     +0.0005         +0.0005          4149          4151          4149          4151
bm_std_is_permutation_diff_last/8192                     +0.0008         +0.0008          4151          4154          4150          4153
bm_std_is_permutation_diff_last/8193                     +0.0004         +0.0004          4151          4153          4151          4153
bm_std_is_permutation_diff_last/16383                    -0.2027         -0.2026         10466          8345         10463          8343
bm_std_is_permutation_diff_last/16384                    -0.2033         -0.2032         10471          8343         10470          8343
bm_std_is_permutation_diff_last/16385                    -0.2001         -0.2002         10434          8346         10434          8345
bm_std_is_permutation_diff_last/32767                    +0.0024         +0.0025         17417         17459         17412         17455
bm_std_is_permutation_diff_last/32768                    +0.0025         +0.0026         17410         17454         17409         17453
bm_std_is_permutation_diff_last/32769                    +0.0024         +0.0025         17414         17456         17409         17451
bm_std_is_permutation_diff_last/65535                    -0.0164         -0.0164         35945         35356         35941         35353
bm_std_is_permutation_diff_last/65536                    -0.0154         -0.0154         35923         35369         35912         35358
bm_std_is_permutation_diff_last/65537                    -0.0154         -0.0154         35920         35366         35918         35363
bm_std_is_permutation_diff_last/131071                   -0.0198         -0.0197         72308         70874         72286         70859
bm_std_is_permutation_diff_last/131072                   -0.0179         -0.0178         72137         70849         72131         70848
bm_std_is_permutation_diff_last/131073                   -0.0176         -0.0176         72145         70873         72126         70858
bm_std_is_permutation_diff_last/262143                   -0.0040         -0.0039        141768        141205        141756        141202
bm_std_is_permutation_diff_last/262144                   -0.0021         -0.0020        141508        141217        141500        141213
bm_std_is_permutation_diff_last/262145                   -0.0021         -0.0020        141555        141264        141513        141237
bm_std_is_permutation_diff_last/524287                   -0.0069         -0.0068        285539        283563        285506        283556
bm_std_is_permutation_diff_last/524288                   -0.0080         -0.0081        285543        283269        285513        283211
bm_std_is_permutation_diff_last/524289                   -0.0064         -0.0063        285337        283520        285320        283519
bm_ranges_is_permutation_diff_last/1                     +0.0003         +0.0004             8             8             8             8
bm_ranges_is_permutation_diff_last/2                     +0.1625         +0.1625            10            11            10            11
bm_ranges_is_permutation_diff_last/3                     +0.2175         +0.2176            10            12            10            12
bm_ranges_is_permutation_diff_last/4                     +0.2579         +0.2580            11            14            11            14
bm_ranges_is_permutation_diff_last/5                     +0.2748         +0.2748            12            15            12            15
bm_ranges_is_permutation_diff_last/6                     +0.2889         +0.2893            12            16            12            16
bm_ranges_is_permutation_diff_last/7                     +0.2858         +0.2860            13            17            13            17
bm_ranges_is_permutation_diff_last/8                     +0.2762         +0.2762            14            18            14            18
bm_ranges_is_permutation_diff_last/15                    +0.3667         +0.3667            20            27            20            27
bm_ranges_is_permutation_diff_last/16                    +0.2949         +0.2950            22            28            22            28
bm_ranges_is_permutation_diff_last/17                    +0.3767         +0.3768            21            29            21            29
bm_ranges_is_permutation_diff_last/31                    +0.4163         +0.4167            33            46            33            46
bm_ranges_is_permutation_diff_last/32                    +0.3689         +0.3687            35            47            35            47
bm_ranges_is_permutation_diff_last/33                    +0.4191         +0.4198            34            49            34            49
bm_ranges_is_permutation_diff_last/63                    +0.4547         +0.4544            58            85            58            85
bm_ranges_is_permutation_diff_last/64                    +0.4264         +0.4265            60            86            60            86
bm_ranges_is_permutation_diff_last/65                    +0.4537         +0.4543            60            87            60            87
bm_ranges_is_permutation_diff_last/127                   +0.4154         +0.4153           122           173           122           173
bm_ranges_is_permutation_diff_last/128                   +0.4238         +0.4241           122           174           122           174
bm_ranges_is_permutation_diff_last/129                   +0.4227         +0.4224           123           175           123           175
bm_ranges_is_permutation_diff_last/255                   +0.4563         +0.4566           225           327           225           327
bm_ranges_is_permutation_diff_last/256                   +0.4578         +0.4576           225           328           225           328
bm_ranges_is_permutation_diff_last/257                   +0.4604         +0.4608           226           330           226           330
bm_ranges_is_permutation_diff_last/511                   +0.4790         +0.4788           430           636           430           636
bm_ranges_is_permutation_diff_last/512                   +0.4771         +0.4775           431           637           431           637
bm_ranges_is_permutation_diff_last/513                   +0.4776         +0.4777           432           638           432           638
bm_ranges_is_permutation_diff_last/1023                  +0.4878         +0.4883           842          1253           842          1253
bm_ranges_is_permutation_diff_last/1024                  +0.4889         +0.4894           843          1255           842          1255
bm_ranges_is_permutation_diff_last/1025                  +0.4917         +0.4915           843          1257           843          1257
bm_ranges_is_permutation_diff_last/2047                  +0.4947         +0.4948          1665          2488          1665          2488
bm_ranges_is_permutation_diff_last/2048                  +0.4941         +0.4942          1666          2490          1666          2489
bm_ranges_is_permutation_diff_last/2049                  +0.4940         +0.4945          1667          2491          1667          2491
bm_ranges_is_permutation_diff_last/4095                  +0.4952         +0.4959          3319          4962          3316          4961
bm_ranges_is_permutation_diff_last/4096                  +0.4963         +0.4964          3316          4962          3316          4962
bm_ranges_is_permutation_diff_last/4097                  +0.4969         +0.4969          3318          4967          3318          4967
bm_ranges_is_permutation_diff_last/8191                  +0.4983         +0.4987          6607          9899          6605          9899
bm_ranges_is_permutation_diff_last/8192                  +0.5000         +0.5001          6605          9907          6604          9907
bm_ranges_is_permutation_diff_last/8193                  +0.4983         +0.4986          6608          9900          6606          9900
bm_ranges_is_permutation_diff_last/16383                 +0.4991         +0.4991         13203         19792         13200         19788
bm_ranges_is_permutation_diff_last/16384                 +0.4988         +0.4989         13202         19787         13201         19787
bm_ranges_is_permutation_diff_last/16385                 +0.4990         +0.4988         13204         19792         13203         19788
bm_ranges_is_permutation_diff_last/32767                 +0.4933         +0.4934         26498         39570         26496         39569
bm_ranges_is_permutation_diff_last/32768                 +0.4937         +0.4938         26502         39587         26494         39579
bm_ranges_is_permutation_diff_last/32769                 +0.4937         +0.4939         26498         39580         26495         39580
bm_ranges_is_permutation_diff_last/65535                 +0.4725         +0.4726         53750         79146         53733         79128
bm_ranges_is_permutation_diff_last/65536                 +0.4721         +0.4722         53757         79138         53753         79135
bm_ranges_is_permutation_diff_last/65537                 +0.4731         +0.4728         53741         79164         53739         79146
bm_ranges_is_permutation_diff_last/131071                +0.4711         +0.4711        107567        158239        107562        158236
bm_ranges_is_permutation_diff_last/131072                +0.4712         +0.4714        107574        158262        107560        158262
bm_ranges_is_permutation_diff_last/131073                +0.4711         +0.4712        107594        158277        107561        158244
bm_ranges_is_permutation_diff_last/262143                +0.4697         +0.4698        215311        316436        215294        316433
bm_ranges_is_permutation_diff_last/262144                +0.4691         +0.4692        215391        316424        215372        316425
bm_ranges_is_permutation_diff_last/262145                +0.4694         +0.4695        215330        316410        215322        316410
bm_ranges_is_permutation_diff_last/524287                +0.4681         +0.4681        431391        633320        431281        633183
bm_ranges_is_permutation_diff_last/524288                +0.4673         +0.4674        431481        633128        431455        633110
bm_ranges_is_permutation_diff_last/524289                +0.4687         +0.4685        431358        633533        431321        633394
OVERALL_GEOMEAN                                          +0.0568         +0.0568             0             0             0             0
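
For anyone reading the table above: the first two numeric columns appear to be the relative change between the old and new measurements, so `+0.5000` reads as roughly 50% slower and `-0.2027` as roughly 20% faster. Below is a minimal sketch of that arithmetic, assuming the comparison tool reports `(new - old) / |old|`. The helper names `relative_change` and `matches_reported` are hypothetical and only for illustration. Since the printed old/new values are rounded, recomputing from them only approximates the reported delta.

```cpp
#include <cassert>
#include <cmath>

// Relative change as shown in the "Time" and "CPU" delta columns.
// Negative means the new build is faster, positive means slower.
// Assumption: the comparison tool reports (new - old) / |old|.
// Not meaningful when old_value is 0 (e.g. the OVERALL_GEOMEAN row prints 0s).
inline double relative_change(double old_value, double new_value) {
  return (new_value - old_value) / std::fabs(old_value);
}

// Hypothetical helper: checks that a delta recomputed from the rounded
// old/new columns agrees with the reported delta within a tolerance.
inline bool matches_reported(double old_value, double new_value,
                             double reported, double tolerance = 1e-3) {
  return std::fabs(relative_change(old_value, new_value) - reported) <= tolerance;
}
```

For example, the `bm_ranges_is_permutation_diff_last/8192` row (6605 to 9907) and the `bm_std_is_permutation_diff_last/16383` row (10466 to 8345) both check out against their reported deltas under this reading.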

Member

@ldionne ldionne left a comment

When I run the new benchmarks locally and compare before/after your patch, this is what I get:

Results
Comparing build/default/libcxx/test/benchmarks/algorithms/nonmodifying/Output/is_permutation.bench.cpp.dir/benchmark-result.json to build/candidate/libcxx/test/benchmarks/algorithms/nonmodifying/Output/is_permutation.bench.cpp.dir/benchmark-result.json
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                         +0.0328         +0.0303             4             4             4             4
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                      -0.1053         -0.1055           548           490           547           490
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                      +0.0017         +0.0005          3925          3931          3918          3919
std::is_permutation(deque<int>) (3leg) (common prefix)/8                          +0.0026         +0.0026             7             7             7             7
std::is_permutation(deque<int>) (3leg) (common prefix)/1024                       +0.0149         +0.0105           657           667           657           664
std::is_permutation(deque<int>) (3leg) (common prefix)/8192                       +0.0104         +0.0070          5223          5277          5219          5256
std::is_permutation(list<int>) (3leg) (common prefix)/8                           +0.0475         +0.0485             5             5             5             5
std::is_permutation(list<int>) (3leg) (common prefix)/1024                        +0.4774         +0.4798          1095          1618          1092          1615
std::is_permutation(list<int>) (3leg) (common prefix)/8192                        +0.0071         +0.0071         13884         13983         13860         13959
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8                   +0.0399         +0.0389             4             4             4             4
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/1024                -0.0139         -0.0111           461           454           459           454
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8192                +0.0212         +0.0181          3528          3603          3527          3591
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8                    +0.0034         +0.0032             8             8             8             8
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/1024                 +0.0219         +0.0197           769           786           769           784
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8192                 +0.0147         +0.0138          6121          6211          6121          6205
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8                     +0.0153         +0.0153             6             6             6             6
std::is_permutation(list<int>) (3leg, pred) (common prefix)/1024                  +0.0654         +0.0628          1111          1184          1111          1181
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8192                  +0.3435         +0.3437         11485         15431         11483         15429
std::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.0604         +0.0598             6             7             6             7
std::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.0248         +0.0243           572           586           572           586
std::is_permutation(vector<int>) (4leg) (common prefix)/8192                      +0.0091         +0.0091          4520          4561          4520          4561
std::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.0566         +0.0564            12            12            12            12
std::is_permutation(deque<int>) (4leg) (common prefix)/1024                       +0.0007         +0.0004           811           811           811           811
std::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.0008         +0.0007          6440          6445          6439          6444
std::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.0102         -0.0102             7             7             7             7
std::is_permutation(list<int>) (4leg) (common prefix)/1024                        +0.0079         +0.0079          1070          1079          1070          1078
std::is_permutation(list<int>) (4leg) (common prefix)/8192                        +0.0485         +0.0481         12583         13194         12583         13189
rng::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.1462         +0.1173             7             7             7             7
rng::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.0298         +0.0246           583           600           583           597
rng::is_permutation(vector<int>) (4leg) (common prefix)/8192                      -0.0308         -0.0206          4738          4592          4684          4588
rng::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.2257         +0.2283            10            13            10            13
rng::is_permutation(deque<int>) (4leg) (common prefix)/1024                       -0.0090         -0.0062           825           817           821           816
rng::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.0046         +0.0050          6465          6495          6454          6486
rng::is_permutation(list<int>) (4leg) (common prefix)/8                           +0.0054         +0.0048             7             7             7             7
rng::is_permutation(list<int>) (4leg) (common prefix)/1024                        +0.0921         +0.0527          1074          1173          1074          1130
rng::is_permutation(list<int>) (4leg) (common prefix)/8192                        -0.0457         -0.0489         13452         12837         13452         12795
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   +0.1036         +0.1025             6             7             6             7
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0533         -0.0534           607           574           607           574
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0726         -0.0705          4820          4470          4808          4469
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0179         +0.0174            11            12            11            12
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 -0.0014         +0.0016           854           853           851           852
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.0032         +0.0044          6766          6788          6755          6785
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     -0.0140         -0.0148             8             8             8             8
std::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  +0.0163         +0.0147          1179          1199          1179          1197
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0326         -0.0329         14402         13932         14400         13926
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   +0.1177         +0.1137             6             7             6             7
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0608         -0.0626           609           572           609           570
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0641         -0.0643          4811          4503          4808          4499
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0350         +0.0347            11            12            11            12
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.0221         +0.0180           849           868           849           864
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.0028         +0.0022          6775          6794          6775          6790
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     -0.0094         -0.0094             8             8             8             8
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  -0.0004         -0.0003          1182          1182          1182          1182
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.1109         -0.1109         12824         11402         12823         11401
std::is_permutation(vector<int>) (3leg) (shuffled)/8                              +0.1544         +0.1540            49            56            49            56
std::is_permutation(vector<int>) (3leg) (shuffled)/1024                           -0.0041         -0.0042        417690        415998        417613        415878
std::is_permutation(deque<int>) (3leg) (shuffled)/8                               +0.0312         +0.0311            72            74            72            74
std::is_permutation(deque<int>) (3leg) (shuffled)/1024                            -0.0039         -0.0038        976937        973173        976905        973173
std::is_permutation(list<int>) (3leg) (shuffled)/8                                +0.0698         +0.0696            60            64            60            64
std::is_permutation(list<int>) (3leg) (shuffled)/1024                             +0.0014         +0.0008       2016154       2019032       2016155       2017847
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/8                        +0.0055         +0.0066            62            62            61            62
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/1024                     -0.0142         -0.0122       1070694       1055456       1068171       1055100
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/8                         -0.0045         -0.0024            81            81            81            81
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/1024                      -0.0031         -0.0026       1167616       1164036       1166812       1163794
std::is_permutation(list<int>) (3leg, pred) (shuffled)/8                          -0.1736         -0.1734            91            75            91            75
std::is_permutation(list<int>) (3leg, pred) (shuffled)/1024                       +0.0245         +0.0251       2229788       2284486       2228285       2284297
std::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.1556         +0.1562            49            57            49            57
std::is_permutation(vector<int>) (4leg) (shuffled)/1024                           -0.0037         -0.0024        413315        411780        412744        411745
std::is_permutation(deque<int>) (4leg) (shuffled)/8                               +0.0179         +0.0185            78            80            78            80
std::is_permutation(deque<int>) (4leg) (shuffled)/1024                            +0.0060         +0.0060        975699        981570        975697        981572
std::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0390         +0.0391            60            63            60            63
std::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0042         -0.0034       2018535       2009991       2016729       2009779
rng::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.1454         +0.1470            49            56            49            56
rng::is_permutation(vector<int>) (4leg) (shuffled)/1024                           -0.0226         -0.0223        421068        411541        420894        411504
rng::is_permutation(deque<int>) (4leg) (shuffled)/8                               +0.0739         +0.0734            76            81            76            81
rng::is_permutation(deque<int>) (4leg) (shuffled)/1024                            -0.0013         -0.0017        991798        990556        990445        988751
rng::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0279         +0.0286            61            63            61            63
rng::is_permutation(list<int>) (4leg) (shuffled)/1024                             +0.0008         +0.0011       2013143       2014835       2011920       2014227
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.0090         +0.0091            61            61            61            61
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     -0.2018         -0.2012       1058702        845087       1057847        845031
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         +0.1386         +0.1383            81            92            81            92
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      +0.0404         +0.0405       1164023       1211074       1163886       1211007
std::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          -0.0203         -0.0199            77            75            77            75
std::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       -0.0073         -0.0065       2302394       2285626       2300674       2285627
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        -0.0040         -0.0024            61            61            61            61
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     -0.1933         -0.1933       1047648        845138       1047652        845093
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         +0.1698         +0.1701            79            92            79            92
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      +0.0406         +0.0407       1163094       1210313       1162968       1210314
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          -0.0023         -0.0031            76            76            76            76
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0127         +0.0113       2294352       2323538       2292282       2318209
OVERALL_GEOMEAN                                                                   +0.0188         +0.0180             0             0             0             0

I think that's really interesting. Observations:

  1. We're doing much worse with the new implementation on std::list and std::deque. I don't understand why, so that needs investigation.
  2. We're not doing better on std::vector as we would expect, given that std::mismatch is vectorized. The root cause here seems to be that we don't properly forward the knowledge that the predicate is std::equal_to to the call to std::mismatch. I think that might be due to the use of reference_wrapper, which might inhibit this check. If that's the case, we could avoid using std::ref when we call mismatch, but we should probably fix the underlying issue by making sure that __desugars_to<__equal_tag, ...> understands when it gets passed a reference_wrapper. That seems like a general thing we should fix if it's broken, and that's actually the target of #129312.

I think those are two good directions for investigating, please let me know if you have questions!
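
To illustrate the reference_wrapper concern in observation 2, here is a minimal sketch. It does not reproduce libc++'s internal __desugars_to machinery; is_plain_equal and is_plain_equal_unwrapped are hypothetical traits invented for this example. The point is only that a type-based check for "this predicate is std::equal_to" stops matching once the predicate is wrapped with std::ref, unless the check explicitly looks through the wrapper.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical trait (not a libc++ name): true only when the predicate type
// is exactly std::equal_to<>.
template <class Pred>
struct is_plain_equal : std::false_type {};

template <>
struct is_plain_equal<std::equal_to<>> : std::true_type {};

// Hypothetical variant that looks through std::reference_wrapper before
// applying the same check.
template <class Pred>
struct is_plain_equal_unwrapped : is_plain_equal<Pred> {};

template <class Pred>
struct is_plain_equal_unwrapped<std::reference_wrapper<Pred>>
    : is_plain_equal_unwrapped<Pred> {};

using Wrapped = std::reference_wrapper<std::equal_to<>>;  // the type std::ref(eq) yields

static_assert(is_plain_equal<std::equal_to<>>::value, "bare predicate is recognized");
static_assert(!is_plain_equal<Wrapped>::value, "wrapped predicate is not recognized");
static_assert(is_plain_equal_unwrapped<Wrapped>::value, "unwrapping restores recognition");
```

If an optimized code path is selected by a check like the first trait, wrapping the predicate would silently send the call down the generic path instead.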

@ldionne
Member

ldionne commented Mar 19, 2025

What platform are you running your benchmarks on? I also just discovered that we were not enabling vectorization in mismatch on AppleClang due to a subtle mistake, so that could add to the noise here. Indeed, this is the speedup I get when I drop std::ref from the mismatch call and ensure that we vectorize on AppleClang:

Benchmark                                                                      Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                   -0.4075         -0.4075             4             2             4             2
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                -0.8155         -0.8154           548           101           547           101
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                -0.7890         -0.7887          3925           828          3918           828
OVERALL_GEOMEAN                                                             -0.9526         -0.9526             0             0             0             0

Edit: The AppleClang issue should be solved by #132090.
Edit 2: The reference_wrapper issue should be solved by #132092

@imdj
Contributor Author

imdj commented Mar 19, 2025

Thank you for the feedback. I'll update the repo, incorporate those changes, and try to investigate further accordingly.

What platform are you running your benchmarks on?

I'm using Linux (openSUSE).

Replace hand-written loops with vectorized std::mismatch, std::find_if, and std::count_if
@imdj imdj force-pushed the main branch 3 times, most recently from 7bfebb7 to 659e8a5 Compare March 20, 2025 09:28
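
For readers following the thread, the approach named in the commit title can be sketched roughly as follows using only the public standard algorithms. This is an illustrative simplification, not the code in this PR: is_permutation_sketch is a made-up name, it handles only the three-iterator form with operator==, and it assumes the second range holds at least as many elements as the first.

```cpp
#include <algorithm>
#include <iterator>

// Illustrative only; not the libc++ implementation.
template <class FwdIt1, class FwdIt2>
bool is_permutation_sketch(FwdIt1 first1, FwdIt1 last1, FwdIt2 first2) {
  // Step 1: skip the common prefix in a single pass.
  auto mm = std::mismatch(first1, last1, first2);
  first1 = mm.first;
  first2 = mm.second;
  if (first1 == last1)
    return true;

  // Assumption: the second range has as many remaining elements as the first.
  FwdIt2 last2 = std::next(first2, std::distance(first1, last1));

  for (FwdIt1 it = first1; it != last1; ++it) {
    const auto& value = *it;
    auto same = [&value](const auto& x) { return x == value; };

    // Step 2: skip values that were already counted earlier in [first1, it).
    if (std::find_if(first1, it, same) != it)
      continue;

    // Step 3: the value must occur equally often in both remaining ranges.
    auto in_second = std::count_if(first2, last2, same);
    if (in_second == 0 || in_second != std::count_if(it, last1, same))
      return false;
  }
  return true;
}
```

Only step 1 is a linear scan over both ranges at once, which is why the common-prefix benchmarks are the ones where a vectorized std::mismatch could show up.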
@imdj
Contributor Author

imdj commented Mar 20, 2025

These are the benchmark results I'm getting now after adding the second mismatch and incorporating #132090 and #132092 locally.

Results:
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                         -0.0089         -0.0088             6             6             6             6
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                      +0.0001         +0.0002           529           529           529           529
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                      +0.0001         +0.0001          4140          4141          4139          4140
std::is_permutation(deque<int>) (3leg) (common prefix)/8                          +0.0002         +0.0002            21            21            21            21
std::is_permutation(deque<int>) (3leg) (common prefix)/1024                       -0.1635         -0.1635          2481          2075          2480          2075
std::is_permutation(deque<int>) (3leg) (common prefix)/8192                       -0.1654         -0.1654         19863         16577         19863         16577
std::is_permutation(list<int>) (3leg) (common prefix)/8                           +0.0014         +0.0014             9             9             9             9
std::is_permutation(list<int>) (3leg) (common prefix)/1024                        -0.0007         -0.0007          3863          3860          3863          3860
std::is_permutation(list<int>) (3leg) (common prefix)/8192                        -0.0119         -0.0118         42229         41729         42219         41719
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8                   +0.0017         +0.0017             7             7             7             7
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/1024                +0.0001         +0.0002           631           631           631           631
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8192                +0.0008         +0.0008          4961          4965          4961          4965
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8                    +0.0886         +0.0886            16            18            16            18
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/1024                 +0.0017         +0.0015          1256          1258          1256          1258
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8192                 -0.0101         -0.0101         10086          9985         10086          9984
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8                     -0.0001         -0.0001            10            10            10            10
std::is_permutation(list<int>) (3leg, pred) (common prefix)/1024                  +0.0034         +0.0034          4194          4209          4194          4209
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8192                  -0.0299         -0.0301         41928         40673         41927         40664
std::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.0101         +0.0101            14            14            14            14
std::is_permutation(vector<int>) (4leg) (common prefix)/1024                      -0.0004         -0.0002           843           842           842           842
std::is_permutation(vector<int>) (4leg) (common prefix)/8192                      +0.0001         +0.0003          6607          6607          6605          6607
std::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.0526         +0.0528            35            37            35            37
std::is_permutation(deque<int>) (4leg) (common prefix)/1024                       +0.1007         +0.1008          2500          2751          2500          2751
std::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.0038         +0.0038         22227         22311         22222         22307
std::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.0516         -0.0516            13            12            13            12
std::is_permutation(list<int>) (4leg) (common prefix)/1024                        -0.0036         -0.0036          3913          3899          3912          3898
std::is_permutation(list<int>) (4leg) (common prefix)/8192                        -0.0304         -0.0304         42040         40764         42039         40763
rng::is_permutation(vector<int>) (4leg) (common prefix)/8                         -0.0051         -0.0051            15            15            15            15
rng::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.0002         +0.0002           843           843           843           843
rng::is_permutation(vector<int>) (4leg) (common prefix)/8192                      -0.0004         -0.0004          6610          6607          6609          6607
rng::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.0038         +0.0038            37            37            37            37
rng::is_permutation(deque<int>) (4leg) (common prefix)/1024                       -0.0006         -0.0006          2909          2907          2909          2907
rng::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.0007         +0.0007         23177         23192         23176         23192
rng::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.0303         -0.0304            14            13            14            13
rng::is_permutation(list<int>) (4leg) (common prefix)/1024                        -0.0017         -0.0019          3857          3851          3857          3850
rng::is_permutation(list<int>) (4leg) (common prefix)/8192                        -0.0288         -0.0288         41948         40739         41947         40738
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   +0.0001         -0.0001            11            11            11            11
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                +0.0003         +0.0003           838           838           838           838
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                +0.0007         +0.0007          6604          6608          6603          6607
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0504         +0.0504            33            34            33            34
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.3967         +0.3964          2083          2909          2083          2908
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.3952         +0.3950         16593         23151         16592         23146
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     +0.0000         +0.0000            12            12            12            12
std::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  -0.0051         -0.0051          3872          3852          3872          3852
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0293         -0.0293         42144         40908         42133         40899
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   +0.0052         +0.0052            11            11            11            11
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                +0.0000         -0.0002           838           838           838           838
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                +0.0237         +0.0237          6603          6759          6603          6759
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.1165         +0.1165            33            37            33            37
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.3948         +0.3948          2083          2906          2083          2906
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.3983         +0.3983         16583         23188         16582         23187
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     +0.0007         +0.0007            12            12            12            12
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  -0.0004         -0.0004          3882          3881          3882          3881
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0288         -0.0288         41979         40770         41978         40769
std::is_permutation(vector<int>) (3leg) (shuffled)/8                              +0.0126         +0.0126            97            98            97            98
std::is_permutation(vector<int>) (3leg) (shuffled)/1024                           +0.0009         +0.0009        698680        699327        698523        699184
std::is_permutation(deque<int>) (3leg) (shuffled)/8                               +0.0164         +0.0164           220           224           220           224
std::is_permutation(deque<int>) (3leg) (shuffled)/1024                            +0.0004         +0.0004       2986223       2987287       2985635       2986691
std::is_permutation(list<int>) (3leg) (shuffled)/8                                +0.0011         +0.0011           174           174           174           174
std::is_permutation(list<int>) (3leg) (shuffled)/1024                             -0.0015         -0.0015       4585000       4578319       4584252       4577380
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/8                        +0.0333         +0.0334           135           139           135           139
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/1024                     +0.0009         +0.0009       1976553       1978340       1976050       1977906
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/8                         -0.2380         -0.2380           241           184           241           184
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/1024                      -0.3217         -0.3217       3198900       2169813       3198126       2169347
std::is_permutation(list<int>) (3leg, pred) (shuffled)/8                          +0.0170         +0.0172           125           127           125           127
std::is_permutation(list<int>) (3leg, pred) (shuffled)/1024                       -0.0017         -0.0016       4899881       4891630       4898641       4890573
std::is_permutation(vector<int>) (4leg) (shuffled)/8                              -0.0126         -0.0125            96            95            96            95
std::is_permutation(vector<int>) (4leg) (shuffled)/1024                           +0.0004         +0.0004        700149        700446        700132        700446
std::is_permutation(deque<int>) (4leg) (shuffled)/8                               +0.0041         +0.0041           234           235           234           235
std::is_permutation(deque<int>) (4leg) (shuffled)/1024                            +0.0003         +0.0004       3193933       3194955       3193682       3194899
std::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0029         +0.0029           178           178           178           178
std::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0031         -0.0030       4586281       4572054       4584761       4571155
rng::is_permutation(vector<int>) (4leg) (shuffled)/8                              -0.0140         -0.0140            99            97            99            97
rng::is_permutation(vector<int>) (4leg) (shuffled)/1024                           +0.0001         +0.0000        698294        698336        698288        698315
rng::is_permutation(deque<int>) (4leg) (shuffled)/8                               -0.0699         -0.0699           237           221           237           221
rng::is_permutation(deque<int>) (4leg) (shuffled)/1024                            -0.0660         -0.0660       3195729       2984915       3195573       2984812
rng::is_permutation(list<int>) (4leg) (shuffled)/8                                -0.0006         -0.0005           178           178           178           178
rng::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0026         -0.0024       4586701       4574790       4585880       4574676
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.0573         +0.0573           114           121           114           121
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     -0.0005         -0.0004       1610369       1609640       1610279       1609599
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         -0.2122         -0.2122           230           181           230           181
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      -0.3077         -0.3077       3197346       2213480       3197042       2213402
std::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          -0.0892         -0.0890           141           128           141           128
std::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0096         +0.0096       5003662       5051572       5003320       5051456
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.0586         +0.0586           114           121           114           121
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     +0.0020         +0.0018       1607373       1610551       1607305       1610201
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         +0.0090         +0.0088           214           216           214           216
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      +0.0059         +0.0056       2883862       2901004       2883558       2899833
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          -0.0405         -0.0407           133           128           133           128
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0086         +0.0086       5020501       5063769       5019298       5062632
OVERALL_GEOMEAN                                                                   -0.0022         -0.0022             0             0             0             0

@imdj
Contributor Author

imdj commented Mar 20, 2025

Indeed, this is the speedup I get when I drop std::ref from the mismatch call and ensure that we vectorize on AppleClang

I'm missing some piece of the puzzle: I couldn't reach those numbers. As mentioned in the benchmarks above, I'm currently getting:

OVERALL_GEOMEAN                                                                   -0.0022         -0.0022             0             0             0             0

Just dropping std::ref(), i.e. writing auto __result = std::mismatch(__first1, __last1, __first2, __pred); instead, leads to copies and failed tests, and even then the performance still took a hit:

OVERALL_GEOMEAN                                                                   -0.0013         -0.0013             0             0             0             0

I also tried using std::__mismatch directly to avoid making copies, something like this:

__identity __ident;
auto __result = std::__mismatch(__first1, __last1, __first2, __pred, __ident, __ident);

but that also ended up with worse results:

OVERALL_GEOMEAN                                                                   -0.0009         -0.0009             0             0             0             0

What am I doing wrong :(
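
As background on the "copies" remark, a small sketch using the public std::mismatch shows what changes when std::ref is dropped. The standard algorithms take their predicate by value, so a stateful predicate is copied unless it is wrapped. CountingEqual and count_calls are hypothetical names for this example, and whether this is exactly what the failing tests check is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stateful predicate: counts how many comparisons it performs.
struct CountingEqual {
  std::size_t calls = 0;
  bool operator()(int a, int b) {
    ++calls;
    return a == b;
  }
};

// Returns {calls recorded by the caller's predicate when passed by value,
//          calls recorded when passed through std::ref}.
// Assumes b has at least as many elements as a.
std::pair<std::size_t, std::size_t> count_calls(const std::vector<int>& a,
                                                const std::vector<int>& b) {
  CountingEqual by_value;
  // The algorithm works on its own copy; by_value itself is never invoked.
  std::mismatch(a.begin(), a.end(), b.begin(), by_value);

  CountingEqual by_ref;
  // The wrapper forwards every call to by_ref, so its counter is updated.
  std::mismatch(a.begin(), a.end(), b.begin(), std::ref(by_ref));

  return {by_value.calls, by_ref.calls};
}
```

This is the trade-off discussed above: std::ref preserves the caller's predicate object, but it also changes the predicate's type as seen by the algorithm.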

@ldionne
Member

ldionne commented Mar 20, 2025

@imdj What is your lit invocation for running the benchmark? Are you enabling optimizations?

Per https://libcxx.llvm.org/TestingLibcxx.html#benchmarks:

libcxx/utils/libcxx-lit <build> libcxx/test/benchmarks/algorithms/nonmodifying/is_permutation.bench.cpp --show-all --param optimization=speed

@imdj
Contributor Author

imdj commented Mar 20, 2025

@imdj What is your lit invocation for running the benchmark? Are you enabling optimizations?

Per https://libcxx.llvm.org/TestingLibcxx.html#benchmarks:

libcxx/utils/libcxx-lit <build> libcxx/test/benchmarks/algorithms/nonmodifying/is_permutation.bench.cpp --show-all --param optimization=speed

Yep, that's the exact workflow. Then I compare my PR build against a fairly up-to-date llvm main repo using libcxx/utils/libcxx-compare-benchmarks.

I'm noticing a lot of fluctuation though between runs. Something is probably off in my setup. I'll double check with the tips at: https://llvm.org/docs/Benchmarking.html.

@ldionne
Member

ldionne commented Mar 20, 2025

I also notice a bit of fluctuation between runs for this algorithm, especially for std::list. So far I've assumed that this was due to cache effects. The fluctuation is around 10-15% of the benchmark time for std::list, and less for more contiguous data structures. If you're seeing a lot more fluctuation, something may be off somewhere (in your setup or in our benchmarks).

@imdj
Contributor Author

imdj commented Mar 20, 2025

So, I compared a few of the benchmark results in pairs and ranked the top 10 by largest relative change. Overall, the results match your findings, @ldionne:

------------------------------------------------------------------------------------------------------------------------
Benchmark                                                         | Bad CPU (ns) | Good CPU (ns) | Abs Diff   | Change (%)     
------------------------------------------------------------------------------------------------------------------------
rng::is_permutation(vector<int>) (4leg) (shuffled)/1024             711634.00      698098.00      13536.00     1.90           
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/8          120.00         122.00         2.00         1.67           
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8192    41729.00       41151.00       578.00       1.39           
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192    41816.00       41271.00       545.00       1.30           
std::is_permutation(list<int>) (4leg) (common prefix)/8192          41871.00       41359.00       512.00       1.22           
rng::is_permutation(list<int>) (4leg) (common prefix)/8192          41802.00       41295.00       507.00       1.21           
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8192    41976.00       41490.00       486.00       1.16           
std::is_permutation(list<int>) (4leg, pred) (common prefix)/1024    3841.00        3875.00        34.00        0.89           
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8          120.00         121.00         1.00         0.83           
std::is_permutation(vector<int>) (3leg) (shuffled)/8                98.80          98.10          0.70         0.71
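
For anyone reproducing this kind of ranking, the arithmetic can be sketched as below. The Time and CPU columns in the comparison tables in this thread are consistent with a signed relative change of (new - old) / old; for example, 12824 to 11402 gives about -0.1109. Row, relative_change, and rank_by_change are hypothetical names for this example, not part of any benchmark tooling.

```cpp
#include <algorithm>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical record for one benchmark measured in two runs.
struct Row {
  std::string name;
  double before_ns;
  double after_ns;
};

// Signed relative change: negative means the second run was faster.
double relative_change(double old_ns, double new_ns) {
  return (new_ns - old_ns) / old_ns;
}

// Orders rows so that the largest absolute relative change comes first.
void rank_by_change(std::vector<Row>& rows) {
  std::sort(rows.begin(), rows.end(), [](const Row& l, const Row& r) {
    return std::fabs(relative_change(l.before_ns, l.after_ns)) >
           std::fabs(relative_change(r.before_ns, r.after_ns));
  });
}
```

Ranking by relative rather than absolute difference keeps the /8 cases comparable with the /8192 cases, although at a handful of nanoseconds a single tick of difference already shows up as a change of a percent or more.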

For reference, here's my build config:

cmake -G Ninja -S runtimes -B build -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_LINKER=lld -DLLVM_BUILD_STATIC=ON -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind"


github-actions bot commented Mar 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@imdj
Contributor Author

imdj commented Mar 24, 2025

I tried to do some more experiments, this time also comparing GCC 14.2.1 against Clang 19.1.7. I found that getting rid of the preemptive advancement of the iterator resulted in a significant boost (relative to the -0.0022 I was getting earlier) and enabled better optimizations in both compilers.

Let me know what you think and if there are any further suggestions. Here are the results after the change:

GCC v14 results (Mean Time: -0.1017)
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                         +0.0698         +0.0700             6             6             6             6
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                      -0.0043         -0.0043           529           527           529           526
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                      -0.0013         -0.0011          4142          4136          4141          4136
std::is_permutation(deque<int>) (3leg) (common prefix)/8                          -0.0706         -0.0706            21            20            21            20
std::is_permutation(deque<int>) (3leg) (common prefix)/1024                       +0.0007         +0.0009          2480          2482          2479          2482
std::is_permutation(deque<int>) (3leg) (common prefix)/8192                       +0.0147         +0.0147         19809         20101         19809         20100
std::is_permutation(list<int>) (3leg) (common prefix)/8                           +0.0004         +0.0002             9             9             9             9
std::is_permutation(list<int>) (3leg) (common prefix)/1024                        +0.0038         +0.0038          3848          3863          3848          3863
std::is_permutation(list<int>) (3leg) (common prefix)/8192                        +0.0003         +0.0006         41776         41790         41766         41790
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8                   -0.0020         -0.0020             7             7             7             7
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/1024                -0.0017         -0.0019           632           631           632           630
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8192                -0.0042         -0.0042          4981          4960          4981          4960
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8                    +0.0869         +0.0867            16            18            16            18
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/1024                 +0.0004         +0.0006          1257          1258          1257          1258
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8192                 +0.0025         +0.0023          9979         10003          9979         10001
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8                     -0.0002         +0.0000            10            10            10            10
std::is_permutation(list<int>) (3leg, pred) (common prefix)/1024                  +0.0006         +0.0004          4212          4215          4212          4214
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8192                  +0.0153         +0.0155         40564         41183         40556         41183
std::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.0280         +0.0278            14            14            14            14
std::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.0006         +0.0009           842           842           842           842
std::is_permutation(vector<int>) (4leg) (common prefix)/8192                      +0.0005         +0.0004          6605          6608          6605          6608
std::is_permutation(deque<int>) (4leg) (common prefix)/8                          -0.2560         -0.2560            36            27            36            27
std::is_permutation(deque<int>) (4leg) (common prefix)/1024                       -0.4996         -0.4996          2529          1266          2529          1266
std::is_permutation(deque<int>) (4leg) (common prefix)/8192                       -0.5466         -0.5467         22048          9996         22048          9994
std::is_permutation(list<int>) (4leg) (common prefix)/8                           +0.0004         +0.0004            13            13            13            13
std::is_permutation(list<int>) (4leg) (common prefix)/1024                        +0.0199         +0.0197          3866          3943          3866          3942
std::is_permutation(list<int>) (4leg) (common prefix)/8192                        +0.0060         +0.0060         40626         40872         40626         40872
rng::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.1060         +0.1058            15            17            15            17
rng::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.0012         +0.0012           843           844           843           844
rng::is_permutation(vector<int>) (4leg) (common prefix)/8192                      -0.0001         -0.0001          6608          6608          6608          6608
rng::is_permutation(deque<int>) (4leg) (common prefix)/8                          -0.0317         -0.0315            37            36            37            36
rng::is_permutation(deque<int>) (4leg) (common prefix)/1024                       -0.0008         -0.0010          2909          2907          2909          2906
rng::is_permutation(deque<int>) (4leg) (common prefix)/8192                       -0.0015         -0.0013         23223         23188         23218         23188
rng::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.0002         -0.0002            14            14            14            14
rng::is_permutation(list<int>) (4leg) (common prefix)/1024                        +0.0042         +0.0044          3839          3855          3838          3855
rng::is_permutation(list<int>) (4leg) (common prefix)/8192                        +0.0156         +0.0154         40622         41256         40622         41247
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   -0.1135         -0.1133            11            10            11            10
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0034         -0.0034           838           836           838           836
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0011         -0.0011          6609          6602          6607          6600
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0381         +0.0381            33            34            33            34
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.3936         +0.3936          2084          2904          2083          2904
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.3959         +0.3959         16586         23152         16585         23152
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     +0.0003         +0.0001            12            12            12            12
std::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  +0.0153         +0.0153          3844          3903          3844          3903
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0056         -0.0056         41698         41465         41689         41455
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   -0.1110         -0.1110            11            10            11            10
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0034         -0.0036           838           836           838           835
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0112         -0.0110          6673          6598          6671          6598
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0533         +0.0531            33            35            33            35
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.3932         +0.3935          2084          2903          2083          2903
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.3957         +0.3955         16587         23151         16587         23146
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     -0.0006         -0.0004            12            12            12            12
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  -0.0012         -0.0014          3885          3881          3885          3880
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0027         -0.0024         41501         41391         41492         41391
std::is_permutation(vector<int>) (3leg) (shuffled)/8                              -0.0077         -0.0077            95            95            95            95
std::is_permutation(vector<int>) (3leg) (shuffled)/1024                           -0.0083         -0.0081        706015        700140        705871        700142
std::is_permutation(deque<int>) (3leg) (shuffled)/8                               +0.0011         +0.0011           220           220           220           220
std::is_permutation(deque<int>) (3leg) (shuffled)/1024                            +0.0007         +0.0009       2985438       2987624       2984818       2987613
std::is_permutation(list<int>) (3leg) (shuffled)/8                                +0.0545         +0.0545           174           183           174           183
std::is_permutation(list<int>) (3leg) (shuffled)/1024                             +0.0017         +0.0019       4568861       4576605       4567919       4576614
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/8                        +0.0112         +0.0112           135           136           135           136
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/1024                     +0.0199         +0.0197       1975693       2015043       1975696       2014617
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/8                         -0.8860         -0.8860           241            27           241            27
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/1024                      -0.9993         -0.9993       3197946          2232       3197952          2231
std::is_permutation(list<int>) (3leg, pred) (shuffled)/8                          +0.0829         +0.0829           126           136           126           136
std::is_permutation(list<int>) (3leg, pred) (shuffled)/1024                       +0.0008         +0.0008       4891431       4895407       4890419       4894382
std::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.0351         +0.0351            96           100            96           100
std::is_permutation(vector<int>) (4leg) (shuffled)/1024                           +0.0015         +0.0013        700257        701309        700253        701165
std::is_permutation(deque<int>) (4leg) (shuffled)/8                               +0.0107         +0.0110           234           237           234           237
std::is_permutation(deque<int>) (4leg) (shuffled)/1024                            +0.0006         +0.0004       3195212       3197090       3195197       3196420
std::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0510         +0.0512           178           187           178           187
std::is_permutation(list<int>) (4leg) (shuffled)/1024                             +0.0025         +0.0023       4572262       4583785       4572271       4582837
rng::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.0969         +0.0972            97           107            97           107
rng::is_permutation(vector<int>) (4leg) (shuffled)/1024                           +0.0391         +0.0389        699073        726408        699069        726263
rng::is_permutation(deque<int>) (4leg) (shuffled)/8                               +0.0250         +0.0252           237           243           237           243
rng::is_permutation(deque<int>) (4leg) (shuffled)/1024                            -0.0002         -0.0003       3195994       3195224       3195984       3194958
rng::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0478         +0.0480           178           187           178           187
rng::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0006         -0.0008       4568130       4565219       4568106       4564267
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.2592         +0.2594           114           144           114           144
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     +0.2197         +0.2197       1614028       1968680       1614019       1968684
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         -0.2631         -0.2629           230           170           230           170
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      -0.3086         -0.3086       3197877       2210887       3197836       2210891
std::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          -0.0263         -0.0263           141           137           141           137
std::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0190         +0.0190       4998838       5093826       4998848       5093802
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.2610         +0.2607           114           144           114           144
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     +0.2236         +0.2239       1611720       1972173       1611370       1972142
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         -0.2045         -0.2046           214           171           214           170
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      -0.2350         -0.2348       2886941       2208494       2886321       2208498
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          +0.0041         +0.0041           142           142           142           142
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0173         +0.0171       5016855       5103609       5016866       5102603
OVERALL_GEOMEAN                                                                   -0.1017         -0.1017             0             0             0             0
Clang v19 results (Mean Time: -0.0432)
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                         -0.1945         -0.1948             8             6             8             6
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                      -0.7643         -0.7644           834           197           834           197
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                      -0.7233         -0.7233          6601          1826          6599          1826
std::is_permutation(deque<int>) (3leg) (common prefix)/8                          -0.0177         -0.0174            21            20            21            20
std::is_permutation(deque<int>) (3leg) (common prefix)/1024                       -0.0006         -0.0006          2074          2073          2074          2073
std::is_permutation(deque<int>) (3leg) (common prefix)/8192                       -0.0014         -0.0011         16578         16555         16574         16555
std::is_permutation(list<int>) (3leg) (common prefix)/8                           +0.0014         +0.0012            10            10            10            10
std::is_permutation(list<int>) (3leg) (common prefix)/1024                        +0.0006         +0.0009          3100          3102          3099          3102
std::is_permutation(list<int>) (3leg) (common prefix)/8192                        -0.0033         -0.0035         30416         30314         30415         30308
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8                   +0.0045         +0.0045            10            10            10            10
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/1024                +0.0003         +0.0003          1255          1256          1255          1256
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8192                -0.0040         -0.0040         10225         10184         10225         10184
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8                    -0.1690         -0.1688            24            20            24            20
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/1024                 -0.3387         -0.3389          2491          1647          2490          1647
std::is_permutation(deque<int>) (3leg, pred) (common prefix)/8192                 -0.3220         -0.3219         19882         13480         19878         13480
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8                     +0.4666         +0.4663            15            22            15            22
std::is_permutation(list<int>) (3leg, pred) (common prefix)/1024                  +0.0049         +0.0051          3387          3403          3386          3403
std::is_permutation(list<int>) (3leg, pred) (common prefix)/8192                  -0.0165         -0.0167         33380         32829         33379         32823
std::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.2052         +0.2055            13            16            13            16
std::is_permutation(vector<int>) (4leg) (common prefix)/1024                      -0.1408         -0.1409           980           842           980           842
std::is_permutation(vector<int>) (4leg) (common prefix)/8192                      -0.1459         -0.1459          7732          6604          7732          6604
std::is_permutation(deque<int>) (4leg) (common prefix)/8                          -0.0580         -0.0582            29            27            29            27
std::is_permutation(deque<int>) (4leg) (common prefix)/1024                       -0.4535         -0.4535          2312          1263          2312          1263
std::is_permutation(deque<int>) (4leg) (common prefix)/8192                       -0.3989         -0.3989         16638         10000         16638         10000
std::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.1670         -0.1672            19            15            19            15
std::is_permutation(list<int>) (4leg) (common prefix)/1024                        -0.0576         -0.0576          3308          3117          3308          3117
std::is_permutation(list<int>) (4leg) (common prefix)/8192                        -0.0437         -0.0439         30965         29613         30965         29607
rng::is_permutation(vector<int>) (4leg) (common prefix)/8                         +0.3785         +0.3785            15            20            15            20
rng::is_permutation(vector<int>) (4leg) (common prefix)/1024                      +0.2810         +0.2807           980          1255           980          1255
rng::is_permutation(vector<int>) (4leg) (common prefix)/8192                      +0.2788         +0.2791          7742          9901          7740          9900
rng::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.1996         +0.1993            30            36            30            36
rng::is_permutation(deque<int>) (4leg) (common prefix)/1024                       +0.1962         +0.1964          2086          2496          2086          2496
rng::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.1984         +0.1982         16579         19869         16578         19864
rng::is_permutation(list<int>) (4leg) (common prefix)/8                           -0.1446         -0.1444            20            17            20            17
rng::is_permutation(list<int>) (4leg) (common prefix)/1024                        -0.0227         -0.0229          3298          3223          3298          3222
rng::is_permutation(list<int>) (4leg) (common prefix)/8192                        +0.0217         +0.0219         30209         30865         30203         30864
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   -0.0525         -0.0527            14            14            14            14
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0028         -0.0028          1262          1259          1262          1259
std::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0022         -0.0022         10214         10192         10214         10192
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.0227         +0.0227            34            35            34            35
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 -0.2779         -0.2779          2332          1684          2332          1684
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 -0.2484         -0.2485         18256         13721         18256         13719
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     +0.2689         +0.2692            18            23            18            23
std::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  +0.0153         +0.0151          3392          3444          3392          3443
std::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  -0.0004         -0.0004         32784         32772         32784         32772
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8                   -0.0001         -0.0002            13            13            13            13
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/1024                -0.0020         -0.0018          1260          1257          1260          1257
rng::is_permutation(vector<int>) (4leg, pred) (common prefix)/8192                -0.0018         -0.0020         10212         10193         10212         10191
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.1278         +0.1280            35            40            35            40
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.3906         +0.3904          2389          3322          2389          3321
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.4481         +0.4484         18273         26461         18269         26460
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8                     +0.0369         +0.0367            18            18            18            18
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/1024                  +0.0184         +0.0184          3361          3422          3360          3422
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  +0.0048         +0.0046         32902         33058         32901         33052
std::is_permutation(vector<int>) (3leg) (shuffled)/8                              +0.0309         +0.0311            89            92            89            92
std::is_permutation(vector<int>) (3leg) (shuffled)/1024                           +0.0018         +0.0016        740186        741545        740187        741391
std::is_permutation(deque<int>) (3leg) (shuffled)/8                               +0.1159         +0.1159           207           231           207           231
std::is_permutation(deque<int>) (3leg) (shuffled)/1024                            +0.0364         +0.0362       2880767       2985747       2880763       2985161
std::is_permutation(list<int>) (3leg) (shuffled)/8                                -0.0325         -0.0325           124           120           124           120
std::is_permutation(list<int>) (3leg) (shuffled)/1024                             +0.0028         +0.0026       4548789       4561714       4548749       4560792
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/8                        -0.0345         -0.0345           179           173           179           173
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/1024                     +0.0062         +0.0064       2518745       2534408       2518246       2534367
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/8                         -0.0345         -0.0345           220           212           220           212
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/1024                      -0.0335         -0.0335       3255446       3146391       3255453       3146396
std::is_permutation(list<int>) (3leg, pred) (shuffled)/8                          +0.2016         +0.2016           240           288           240           288
std::is_permutation(list<int>) (3leg, pred) (shuffled)/1024                       +0.0732         +0.0734       5245407       5629350       5244347       5629361
std::is_permutation(vector<int>) (4leg) (shuffled)/8                              -0.0712         -0.0713            95            88            95            88
std::is_permutation(vector<int>) (4leg) (shuffled)/1024                           -0.0154         -0.0152        748810        737291        748663        737287
std::is_permutation(deque<int>) (4leg) (shuffled)/8                               -0.1044         -0.1046           170           152           170           152
std::is_permutation(deque<int>) (4leg) (shuffled)/1024                            -0.1254         -0.1254       2352579       2057488       2352583       2057480
std::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.1322         +0.1322           109           124           109           124
std::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0016         -0.0016       4546353       4539198       4546362       4539183
rng::is_permutation(vector<int>) (4leg) (shuffled)/8                              -0.0128         -0.0128            90            89            90            89
rng::is_permutation(vector<int>) (4leg) (shuffled)/1024                           -0.0044         -0.0044        740259        737016        740250        737017
rng::is_permutation(deque<int>) (4leg) (shuffled)/8                               -0.2595         -0.2594           205           152           205           152
rng::is_permutation(deque<int>) (4leg) (shuffled)/1024                            -0.2895         -0.2896       2895451       2057357       2895438       2056940
rng::is_permutation(list<int>) (4leg) (shuffled)/8                                -0.0137         -0.0135           126           125           126           125
rng::is_permutation(list<int>) (4leg) (shuffled)/1024                             -0.0014         -0.0016       4546535       4540369       4546517       4539459
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        -0.0409         -0.0407           182           174           181           174
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     +0.0039         +0.0037       2525031       2534863       2525025       2534369
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         +0.0279         +0.0281           232           238           232           238
std::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      -0.0610         -0.0612       3468556       3256934       3468542       3256312
std::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          +0.0950         +0.0952           187           205           187           205
std::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0738         +0.0736       5251369       5638866       5251348       5637750
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        -0.0417         -0.0417           181           174           181           174
rng::is_permutation(vector<int>) (4leg, pred) (shuffled)/1024                     +0.0051         +0.0049       2522761       2535627       2522749       2535110
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         -0.0175         -0.0175           225           221           225           221
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/1024                      -0.0944         -0.0946       3470871       3143225       3470857       3142614
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          +0.0917         +0.0919           240           262           240           262
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/1024                       +0.0756         +0.0756       5237991       5634083       5238002       5634094
OVERALL_GEOMEAN                                                                   -0.0432         -0.0432             0             0             0             0

@imdj
Contributor Author

imdj commented Mar 25, 2025

Indeed, this is the speedup I get when I drop std::ref from the mismatch call and ensure that we vectorize on AppleClang:

Benchmark                                                                      Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(vector<int>) (3leg) (common prefix)/8                   -0.4075         -0.4075             4             2             4             2
std::is_permutation(vector<int>) (3leg) (common prefix)/1024                -0.8155         -0.8154           548           101           547           101
std::is_permutation(vector<int>) (3leg) (common prefix)/8192                -0.7890         -0.7887          3925           828          3918           828
OVERALL_GEOMEAN                                                             -0.9526         -0.9526             0             0             0             0

Could the difference in results come from using a different base commit for the old benchmarks? I'm using 9b1f905 as the base.
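
To make the `std::ref` point concrete, here is a minimal sketch of the two call shapes being compared. The helper names `common_prefix_with_ref` and `common_prefix_direct` are made up for illustration and are not the code in this patch. Both return the same answer; whether the library can select a vectorized path for either call is an implementation detail that this sketch does not demonstrate.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: length of the common prefix of a and b,
// with the predicate handed to std::mismatch through std::ref.
// The algorithm then sees a std::reference_wrapper<Pred>, not Pred itself.
template <class Pred>
std::size_t common_prefix_with_ref(const std::vector<int>& a,
                                   const std::vector<int>& b, Pred& pred) {
  auto result =
      std::mismatch(a.begin(), a.end(), b.begin(), b.end(), std::ref(pred));
  return static_cast<std::size_t>(result.first - a.begin());
}

// Hypothetical helper: same computation, predicate passed directly,
// so the algorithm sees the predicate's real type.
template <class Pred>
std::size_t common_prefix_direct(const std::vector<int>& a,
                                 const std::vector<int>& b, Pred pred) {
  auto result = std::mismatch(a.begin(), a.end(), b.begin(), b.end(), pred);
  return static_cast<std::size_t>(result.first - a.begin());
}
```

Called with `std::equal_to<>` on `{1, 2, 3, 4, 5}` and `{1, 2, 3, 9, 5}`, both helpers report a common prefix of length 3. The only difference is the predicate type the algorithm gets to inspect.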

@imdj imdj requested a review from ldionne March 25, 2025 19:20
Member

@ldionne ldionne left a comment


First, make sure to benchmark on the latest main, since I recently fixed two issues where we wouldn't vectorize properly inside mismatch. I pulled your branch and rebased it onto main just now, and these are the benchmarks that do worse for me (I dropped all the lines where your patch was an improvement):

Comparing build/default/libcxx/test/benchmarks/algorithms/nonmodifying/Output/is_permutation.bench.cpp.dir/benchmark-result.json to build/candidate/libcxx/test/benchmarks/algorithms/nonmodifying/Output/is_permutation.bench.cpp.dir/benchmark-result.json
Benchmark                                                                            Time             CPU      Time Old      Time New       CPU Old       CPU New
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
std::is_permutation(list<int>) (3leg) (common prefix)/8                           +0.0801         +0.0824             5             5             5             5
std::is_permutation(list<int>) (3leg) (common prefix)/1024                        +0.4706         +0.4731          1088          1599          1086          1599
std::is_permutation(list<int>) (3leg) (common prefix)/8192                        +0.1887         +0.1899         11456         13618         11445         13618
std::is_permutation(list<int>) (3leg, pred) (common prefix)/1024                  +0.0336         +0.0338          1137          1176          1137          1175
rng::is_permutation(list<int>) (4leg, pred) (common prefix)/8192                  +0.1133         +0.1139         12519         13938         12512         13937
std::is_permutation(list<int>) (3leg) (shuffled)/8                                +0.0699         +0.0701            61            65            61            65
std::is_permutation(list<int>) (3leg, pred) (shuffled)/1024                       +0.0281         +0.0289       2234288       2297102       2232489       2296954
std::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0703         +0.0723            61            65            61            65
rng::is_permutation(list<int>) (4leg) (shuffled)/8                                +0.0819         +0.0818            60            65            60            65
std::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          +0.2659         +0.2659            76            96            76            96
rng::is_permutation(list<int>) (4leg, pred) (shuffled)/8                          +0.2426         +0.2472            77            96            77            96
std::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.2592         +0.2600            12            15            12            15
std::is_permutation(deque<int>) (4leg) (common prefix)/1024                       +0.5906         +0.5919           818          1301           817          1301
std::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.5919         +0.5925          6456         10277          6453         10276
rng::is_permutation(deque<int>) (4leg) (common prefix)/8                          +0.4652         +0.4659            10            15            10            15
rng::is_permutation(deque<int>) (4leg) (common prefix)/1024                       +0.5856         +0.5861           812          1288           812          1288
rng::is_permutation(deque<int>) (4leg) (common prefix)/8192                       +0.5919         +0.5921          6443         10256          6441         10255
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.3581         +0.3591            11            16            11            16
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.4977         +0.4986           862          1291           861          1291
std::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.5084         +0.5102          6826         10296          6817         10295
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8                    +0.3754         +0.3763            11            16            11            16
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/1024                 +0.5137         +0.5142           852          1290           852          1290
rng::is_permutation(deque<int>) (4leg, pred) (common prefix)/8192                 +0.5136         +0.5144          6771         10248          6767         10248
std::is_permutation(deque<int>) (3leg) (shuffled)/8                               +0.0634         +0.0639            73            78            73            78
std::is_permutation(deque<int>) (3leg, pred) (shuffled)/8                         +0.0260         +0.0275            81            83            81            83
rng::is_permutation(deque<int>) (4leg, pred) (shuffled)/8                         +0.3511         +0.3530            81           109            81           109
std::is_permutation(vector<int>) (3leg, pred) (common prefix)/8                   +0.0183         +0.0172             4             4             4             4
std::is_permutation(vector<int>) (3leg) (shuffled)/8                              +0.1508         +0.1512            49            56            49            56
std::is_permutation(vector<int>) (3leg, pred) (shuffled)/8                        +0.0585         +0.0604            62            65            62            65
std::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.0977         +0.0989            49            54            49            54
rng::is_permutation(vector<int>) (4leg) (shuffled)/8                              +0.1512         +0.1525            49            56            49            56
std::is_permutation(vector<int>) (4leg, pred) (shuffled)/8                        +0.0705         +0.0721            61            66            61            66
  • First, vector<int> only does worse on very small sequences. That's a particularity of this benchmark: it operates on fairly small sequences because is_permutation is so expensive. I think we can mostly disregard the slowdown for vector<int>, since it only affects 8-element sequences. I suspect that making our vectorized mismatch faster on small sequences would solve the problem here.
  • Second, we're doing worse on several benchmarks that check the common prefix pattern. With that data pattern, the algorithm should be dominated by mismatch, so I think we need to understand why our current std::mismatch behaves worse on std::deque than the hand-written loop that existed in std::is_permutation before your patch. You could also validate that switching from the hand-written loop to std::mismatch is the cause of the slowdown by locally reverting just that part of the change and checking whether the before/after benchmarks improve for std::deque on common prefix. BTW, you can locally edit the benchmark to run only a subset of the combinations in order to iterate more quickly.
  • Last, we are also doing worse on list with the common prefix pattern. I suspect we might be hitting the same issue as deque.

So TLDR, I'd focus on confirming that std::mismatch is slower on deque and list than a naive hand-written loop, and go from there.
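
One way to check this outside the libc++ benchmark suite is a small standalone comparison. The following is only a rough sketch: `naive_mismatch`, `make_common_prefix`, and `average_ns` are hypothetical helpers written for illustration, and `naive_mismatch` is a stand-in for the hand-written loop, not the code that was actually in std::is_permutation.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <iterator>
#include <list>
#include <numeric>
#include <utility>

// Hypothetical stand-in for a hand-written prefix-skipping loop
// (an assumption for illustration, not libc++'s implementation).
template <class It1, class It2>
std::pair<It1, It2> naive_mismatch(It1 first1, It1 last1, It2 first2, It2 last2) {
  for (; first1 != last1 && first2 != last2; ++first1, (void)++first2) {
    if (!(*first1 == *first2))
      break;
  }
  return {first1, first2};
}

// Builds two containers of n ints that agree everywhere except the last
// element, which approximates the "common prefix" data pattern.
template <class Container>
std::pair<Container, Container> make_common_prefix(std::size_t n) {
  Container a(n);
  std::iota(a.begin(), a.end(), 0);
  Container b = a;
  if (n != 0)
    b.back() = -1;
  return {a, b};
}

// Calls f() reps times and returns the average wall-clock time per call in
// nanoseconds. f must return std::size_t; the result is written to a volatile
// so the compiler cannot discard the calls.
template <class F>
double average_ns(F f, int reps) {
  volatile std::size_t sink = 0;
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < reps; ++i)
    sink = sink + f();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::nano>(stop - start).count() / reps;
}
```

Timing `std::mismatch` and `naive_mismatch` with `average_ns` on `std::deque<int>` and `std::list<int>` inputs from `make_common_prefix` should show whether the gap reproduces in isolation. Numbers from a loop like this are much noisier than the real benchmark, so they are only a first indication.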

if (std::__invoke(__pred, std::__invoke(__proj1, *__match), std::__invoke(__proj1, *__i)))
break;
}
auto __match = std::find_if(__first1, __i, [&](_Ref1 __x) -> bool {

We should probably hold on to the result of *__i. Something like _Ref1 __va = *__i. This simplifies the code a bit and might be a bit faster depending on the kind of iterator.
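
A minimal sketch of that suggestion, written outside libc++ with ordinary names: `seen_before` is a hypothetical helper, and projections are left out for brevity. The point is only that `*i` is evaluated once and the lambda captures the result.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical helper: returns true if an element equivalent to *i (under
// pred) already occurs in [first, i). Requires i to be dereferenceable.
template <class Iter, class Pred>
bool seen_before(Iter first, Iter i, Pred pred) {
  using Ref = typename std::iterator_traits<Iter>::reference;
  Ref va = *i; // dereference once instead of on every predicate call
  auto match = std::find_if(first, i, [&](Ref x) -> bool { return pred(x, va); });
  return match != i;
}
```

For example, with `std::vector<int> v{1, 2, 3, 2}`, `seen_before(v.begin(), v.begin() + 3, std::equal_to<int>())` is true because the value 2 already appears at index 1.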

@imdj changed the title from "[libc++] Tiny optimizations for is_permutation" to "[libc++] Optimizing is_permutation" on Mar 27, 2025