Skip to content

Benchmarks

Morwenn edited this page Feb 6, 2016 · 25 revisions

Benchmarking is hard and I might not be doing it right. Moreover, benchmarking sorting algorithms highlights that the time needed to sort a collection of elements depends on several things: the type to sort, the size of collection, the cost of comparing two values, the cost of moving two values, the distribution of the values in the collection to sort, the collection itself, etc... The aim of this page is to help you choose a sorting algorithm depending on your needs. There is no clear structure to this page: we will mostly see different benchmarks and try to analyze them a bit.

All of the graphs on this page have been generated with slightly modified versions of the scripts found in the project's benchmarks folder. There are just too many things to check; if you ever want a specific benchmark, don't hesitate to ask for it.

Random-access iterables

Most sorting algorithms are designed to work with random-access iterators, so this section is deemed to be bigger than the other ones. Note that many of the algorithms work better with contiguous iterators; the resuts for std::vector and std::deque are probably different.

Unstable sort for std::vector with 10⁶ int

As we can see, even though its algorithmic complexity is excellent, smooth_sorter is simply too slow on average, as is heap_sorter has the same problems. std_sorter actually performs worse than expected (the libstdc++ one). The best sorting algorithms for this configuration are pdq_sorter, quick_sorter, spread_sorter and verge_sorter.

spread_sorter is by far the best sorter when it comes to sorting truly random collections and is excellent when the number of values in the collection is low (16 values in the examples). When there are pipe-organ patterns, verge_sorter seems to be the best match. That said, both of them use additional memory; pqd_sorter is the best sorter if a small memory footprint is needed.

Unstable sort for std::vector with 10⁷ int

I removed smooth_sorter and heap_sorter from this benchmark since their performance wasn't better than in the previous and made the graph harder to read. Compared to the previous graph, a few things have changed: for example, while it's still better than anything else for random distributions, spread_sorter has bad performance for the Alternating distribution. My guess is that it has to do with the fact that it's a radix-sort kind of algorithm and the fact that the Alternating case has both negative values and the widest range of values among the distributions doesn't help.

Both std_sorter seems to reach some kind of worst case behaviour for some distributions, making it less usable than the other unstable sorters in this test case. To be honest, quick_sorter should also have some worst cases, but the median-of-9 pivot selection makes it harder to find such worst cases.

Unstable sort for std::deque with 10⁶ int

The results for std::deque are close to those for std::vector. The most notable difference was that heap_sorter and smooth_sorter were even slower, so I didn't include them for readability. Apparently, random-access has a noticeable cost when the iterators are not contiguous. Otherwise, the conclusions are pretty much the same than with std::vector: when the distribution is truly random, spread_sorter is still the obvious winner, while verge_sorter beats distrubtions where big chunks are already sorted. pdq_sorter is still the best match when a small memory footprint is required. std_sorter still has some kind of worst case behaviour for the Pipe organ, Push front and Push middle patterns.

Bidirectional iterables

Sorting algorithms that handle non-random-access iterators are generally second class citizens, but cpp-sort still has a few ones. Obviously quadratic algorithms such as insertion sort or selection sort are ignored for these benchmarks. The most interesting part is that we can see how generic sorting algorithms perform compared to algorithms such as std::list::sort which are aware of the data structure they are sorting.

Sort std::list with 10⁶ int

First of all, note that default_sorter uses self_sort_adapter, which means that in this benchmark, the sorting algorithm used by default_sorter is actually std::list::sort. Apparently, std::list::sort is efficient for many patterns, but it doesn't handle random distributions that much and also has some problems with Alternating (16 values). On the other hand, quick_sorter works really well when there are few values, but doesn't really handle descending patterns. verge_sorter is rather good when there are not many runs in the collection to sort, and falls back on quick_sorter when the runs aren't big enough, which means that it's good for some patterns, but always slower than quick_sorter otherwise (it's still decent when there aren't many values). Finally, merge_sorter is pretty much always the same, no matter which pattern it has to sort.

Sort std::list with 10⁶ long double

The results are pretty much the same than with integers. The most important difference is that std::list::sort isn't much slower since it does not move/swap values but relinks nodes instead, so no matter the size of the types to sort, relinking always has the same cost. Therefore, it's a bit better than with integers compared to the other sorting algorithms; I expect it to be the better and better when the types to sort are more expensive to move around.

Forward iterables

Even fewer sorters can handle forward iterators; just like with bidirectional iterators, we only care about the sorters that are not almost always quadratic, which means the same than before, except verge_sorter since it sometimes needs to call std::reverse which requires bidirectional iterators.

Sort std::forward_list with 10⁶ int

As expected, default_sorter actually calls std::forward_list::sort, which is a bit better than with bidirectional iterators compared to the other sorting algorithms, probably because it has fewer nodes to relink. quick_sorter is still excellent when there are not many different values in the collection to sort but is less efficient than std::forward_list::sort for almost every specific pattern. merge_sorter is pretty good when the collection to sort is mostly ascending since it adapts to ascending runs; its worst case is better than std::forward_list::sort's one.

Sort std::foward_list with 10⁶ long double

There is almost no difference compared to the previous graph except the overall time. Just like with bidirectional iterators, I would expect std::forward_list::sort to perform gradually better when the objects to sort are more expensive to move around.

Clone this wiki locally