
Add resort to slice/vec #16557


Closed
wants to merge 1 commit into from

Conversation

@Gankra (Contributor) commented Aug 17, 2014

We have a perfectly good implementation of insertion sort available, but don't expose it publicly. It's a great sorting algorithm if the data is known to be mostly sorted (e.g. it was already sorted, and we've only touched a few elements).
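The underlying algorithm is plain insertion sort, which runs in O(n) on already-sorted input and O(n + k) when there are k inversions. A minimal standalone sketch in modern Rust (illustrative only, not the stdlib's actual implementation; the function name is assumed):

```rust
// Classic in-place insertion sort: shift each element left until it
// reaches its sorted position. Near-linear on mostly-sorted input.
fn insertion_sort<T: Ord>(v: &mut [T]) {
    for i in 1..v.len() {
        let mut j = i;
        while j > 0 && v[j] < v[j - 1] {
            v.swap(j, j - 1);
            j -= 1;
        }
    }
}

fn main() {
    // Mostly sorted input: only one element out of place.
    let mut v = vec![1, 2, 3, 7, 4, 5, 6];
    insertion_sort(&mut v);
    println!("{:?}", v); // [1, 2, 3, 4, 5, 6, 7]
}
```

On fully sorted data the inner `while` loop never executes, which is why the sorted-input benchmarks below are essentially the cost of one linear scan.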

I think the benchmarks speak for themselves. On sorted data it completely destroys sort (because it just reads the damn thing and calls it a day). Benchmark results are from make check-collections PLEASE_BENCH=1 NO_REBUILD=1; I can rebuild with different flags if desired:

test slice::bench::resort_big_sorted                       ... bench:     54309 ns/iter (+/- 4181) = 5892 MB/s
test slice::bench::resort_sorted                           ... bench:     43509 ns/iter (+/- 376) = 1838 MB/s

test slice::bench::sort_big_sorted                         ... bench:   1721006 ns/iter (+/- 366156) = 185 MB/s
test slice::bench::sort_random_large                       ... bench:   2506750 ns/iter (+/- 115227) = 31 MB/s

Other ops: (note: resort is actually competitive on random data up to 100 elements!)

test slice::bench::resort_big_random_large                 ... bench: 115285904 ns/iter (+/- 2612076) = 2 MB/s
test slice::bench::resort_big_random_medium                ... bench:     15938 ns/iter (+/- 478) = 200 MB/s
test slice::bench::resort_big_random_small                 ... bench:       445 ns/iter (+/- 20) = 359 MB/s
test slice::bench::resort_random_large                     ... bench:  44449508 ns/iter (+/- 747584) = 1 MB/s
test slice::bench::resort_random_medium                    ... bench:      9628 ns/iter (+/- 119) = 83 MB/s
test slice::bench::resort_random_small                     ... bench:       285 ns/iter (+/- 5) = 140 MB/s

test slice::bench::sort_big_random_large                   ... bench:   4066801 ns/iter (+/- 535054) = 78 MB/s
test slice::bench::sort_big_random_medium                  ... bench:     15441 ns/iter (+/- 549) = 207 MB/s
test slice::bench::sort_big_random_small                   ... bench:       484 ns/iter (+/- 7) = 330 MB/s
test slice::bench::sort_random_medium                      ... bench:     13135 ns/iter (+/- 574) = 60 MB/s
test slice::bench::sort_random_small                       ... bench:       346 ns/iter (+/- 15) = 115 MB/s
test slice::bench::sort_sorted                             ... bench:    726430 ns/iter (+/- 52568) = 110 MB/s

I didn't write any new benches to test e.g. "a couple elements are out of order". I can if it's desired. I also didn't add any unit tests since resort is implicitly tested by sort, for the most part.

@alexcrichton (Member) commented:
You can find some previous discussion of adding various sorting algorithms in #15380, but the conclusion there was that the stdlib should provide one general-purpose sort and we should perhaps have an optional/external crate which provides more specialized sorting such as this.

This isn't exactly the same situation as #15380, but I would likely fall on the same side of the line as the discussion in that PR.

@Gankra (Contributor, Author) commented Aug 18, 2014

I can accept not including this from a simplicity perspective. I've personally run into several contexts where resorting is valuable, but I can certainly understand viewing it as niche.

Honestly, my primary motivation is that it's just there. We've implemented it, but relegated it to working on small inputs internally. I don't think it would introduce much maintenance expense, since basically any algorithm we use for sort would love to have it as a subroutine for small inputs.

That, and the orders of magnitude performance gain that one gets when it's appropriate. In contrast to the debate between quicksort and mergesort, insertion sort fills a very different niche.

Regardless, I bow to the will of core maintainers. I'll leave this open for a bit in the interest of discussion. Unless a clear counter-position forms, I'll likely close this in a day or two.

@arthurprs (Contributor) commented:
As far as I'm aware, in the current state of the stdlib, if you need to keep some items ordered your only option is to heapify them. That's a no-go if you need random access, though.
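Heapifying here presumably means `std::collections::BinaryHeap`, which can be built from an existing `Vec` in O(n). A quick sketch of both the capability and the limitation mentioned:

```rust
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap::from heapifies an existing Vec in O(n).
    let v = vec![3, 1, 4, 1, 5];
    let mut heap: BinaryHeap<i32> = BinaryHeap::from(v);

    // pop() yields elements largest-first...
    assert_eq!(heap.pop(), Some(5));
    assert_eq!(heap.pop(), Some(4));

    // ...but there is no indexed access: `heap[2]` does not compile,
    // which is the "no-go if you need random access" part.
}
```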

@Gankra (Contributor, Author) commented Aug 18, 2014

@arthurprs One example I had was naive maintenance of object ordering in a dynamic environment with unpredictable movement, for instance a z-buffer for render order in a game. You need the full ordering whenever it's requested, and if it's requested frequently, you can reasonably expect that the relative ordering hasn't changed much. So you just insertion-sort the old ordering, and you're golden. If elements are added or removed, you append them to the end or swap_pop them, each introducing at most O(n) inversions. If the number of entities is fairly static, this is tolerable.
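A hypothetical sketch of that pattern (all names are illustrative: `insertion_sort_by_key` stands in for the proposed `resort`, and `Vec::swap_remove` is presumably what the comment calls swap_pop):

```rust
// Insertion sort by a key function; near-O(n) when the slice is nearly sorted.
fn insertion_sort_by_key<T, K: Ord, F: Fn(&T) -> K>(v: &mut [T], key: F) {
    for i in 1..v.len() {
        let mut j = i;
        while j > 0 && key(&v[j]) < key(&v[j - 1]) {
            v.swap(j, j - 1);
            j -= 1;
        }
    }
}

#[derive(Debug)]
struct Entity { id: u32, z: i32 }

fn main() {
    let mut render_order = vec![
        Entity { id: 0, z: 1 },
        Entity { id: 1, z: 3 },
        Entity { id: 2, z: 5 },
    ];
    // A frame passes: one entity moves, one is added, one is removed.
    render_order[0].z = 4;                      // moved slightly
    render_order.push(Entity { id: 3, z: 2 }); // new entity appended at the end
    render_order.swap_remove(2);                // O(1) removal; adds O(n) inversions at worst
    // The vec is still nearly sorted, so insertion sort finishes quickly.
    insertion_sort_by_key(&mut render_order, |e| e.z);
    println!("{:?}", render_order); // ordered by z: 2, 3, 4
}
```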

@Gankra (Contributor, Author) commented Aug 19, 2014

./kill_all_PRs.exe

Gankra closed this Aug 19, 2014