
Add resort to slice/vec #16557


Closed
wants to merge 1 commit into from

Conversation

@Gankra (Contributor) commented Aug 17, 2014

We have a perfectly good implementation of insertion sort available, but don't expose it publicly. It's a great sorting algorithm if the data is known to be mostly sorted (e.g. it was already sorted, and we've only touched a few elements).
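The underlying algorithm is plain insertion sort, which runs in O(n) on already-sorted input and O(n + k) when there are k inversions. A minimal standalone sketch in modern Rust (illustrative only, not the stdlib's actual implementation; the function name is assumed):

```rust
// Classic in-place insertion sort: shift each element left until it
// reaches its sorted position. Near-linear on mostly-sorted input.
fn insertion_sort<T: Ord>(v: &mut [T]) {
    for i in 1..v.len() {
        let mut j = i;
        while j > 0 && v[j] < v[j - 1] {
            v.swap(j, j - 1);
            j -= 1;
        }
    }
}

fn main() {
    // Mostly sorted input: only one element out of place.
    let mut v = vec![1, 2, 3, 7, 4, 5, 6];
    insertion_sort(&mut v);
    println!("{:?}", v); // [1, 2, 3, 4, 5, 6, 7]
}
```

On fully sorted data the inner `while` loop never executes, which is why the sorted-input benchmarks below are essentially the cost of one linear scan.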

I think the benchmarks speak for themselves. On sorted data it completely destroys sort (because it just reads the damn thing and calls it a day). Benchmark results are from make check-collections PLEASE_BENCH=1 NO_REBUILD=1; I can rebuild with different flags if desired:

test slice::bench::resort_big_sorted                       ... bench:     54309 ns/iter (+/- 4181) = 5892 MB/s
test slice::bench::resort_sorted                           ... bench:     43509 ns/iter (+/- 376) = 1838 MB/s

test slice::bench::sort_big_sorted                         ... bench:   1721006 ns/iter (+/- 366156) = 185 MB/s
test slice::bench::sort_random_large                       ... bench:   2506750 ns/iter (+/- 115227) = 31 MB/s

Other ops: (note: resort is actually competitive on random data up to 100 elements!)

test slice::bench::resort_big_random_large                 ... bench: 115285904 ns/iter (+/- 2612076) = 2 MB/s
test slice::bench::resort_big_random_medium                ... bench:     15938 ns/iter (+/- 478) = 200 MB/s
test slice::bench::resort_big_random_small                 ... bench:       445 ns/iter (+/- 20) = 359 MB/s
test slice::bench::resort_random_large                     ... bench:  44449508 ns/iter (+/- 747584) = 1 MB/s
test slice::bench::resort_random_medium                    ... bench:      9628 ns/iter (+/- 119) = 83 MB/s
test slice::bench::resort_random_small                     ... bench:       285 ns/iter (+/- 5) = 140 MB/s

test slice::bench::sort_big_random_large                   ... bench:   4066801 ns/iter (+/- 535054) = 78 MB/s
test slice::bench::sort_big_random_medium                  ... bench:     15441 ns/iter (+/- 549) = 207 MB/s
test slice::bench::sort_big_random_small                   ... bench:       484 ns/iter (+/- 7) = 330 MB/s
test slice::bench::sort_random_medium                      ... bench:     13135 ns/iter (+/- 574) = 60 MB/s
test slice::bench::sort_random_small                       ... bench:       346 ns/iter (+/- 15) = 115 MB/s
test slice::bench::sort_sorted                             ... bench:    726430 ns/iter (+/- 52568) = 110 MB/s

I didn't write any new benches to test e.g. "a couple elements are out of order". I can if it's desired. I also didn't add any unit tests since resort is implicitly tested by sort, for the most part.

@alexcrichton (Member) commented:
You can find some previous discussion of adding various sorting algorithms in #15380, but the conclusion there was that the stdlib should provide one general-purpose sort and we should perhaps have an optional/external crate which provides more specialized sorting such as this.

This isn't exactly the same situation as #15380, but I would likely fall on the same side of the line as the discussion in that PR.

@Gankra (Contributor, Author) commented Aug 18, 2014

I can accept not including this from a simplicity perspective. I've personally run into several contexts where resorting is valuable, but I can certainly understand viewing it as niche.

Honestly, my primary motivation is that it's just there. We've implemented it, but relegated it to working on small inputs internally. I don't think it would introduce much maintenance expense, since basically any algorithm we use for sort would love to have it as a subroutine for small inputs.

That, and the orders of magnitude performance gain that one gets when it's appropriate. In contrast to the debate between quicksort and mergesort, insertion sort fills a very different niche.

Regardless, I bow to the will of core maintainers. I'll leave this open for a bit in the interest of discussion. Unless a clear counter-position forms, I'll likely close this in a day or two.

@arthurprs (Contributor) commented:
As far as I'm aware, in the current state of the stdlib, if you need to keep some items ordered your only option is to heapify them. That's a no-go if you need random access, though.
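Heapifying here presumably means `std::collections::BinaryHeap`, which can be built from an existing `Vec` in O(n). A quick sketch of both the capability and the limitation mentioned:

```rust
use std::collections::BinaryHeap;

fn main() {
    // BinaryHeap::from heapifies an existing Vec in O(n).
    let v = vec![3, 1, 4, 1, 5];
    let mut heap: BinaryHeap<i32> = BinaryHeap::from(v);

    // pop() yields elements largest-first...
    assert_eq!(heap.pop(), Some(5));
    assert_eq!(heap.pop(), Some(4));

    // ...but there is no indexed access: `heap[2]` does not compile,
    // which is the "no-go if you need random access" part.
}
```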

@Gankra (Contributor, Author) commented Aug 18, 2014

@arthurprs One example I had was naive maintenance of object ordering in a dynamic environment with unpredictable movement, for instance a z-buffer for render order in a game. You need the full ordering whenever it's requested, and if it's requested frequently, you can reasonably expect that the relative ordering hasn't changed much. So you just insertion-sort the old ordering, and you're golden. If elements are added or removed, you append them to the end or swap_pop them, each introducing at most O(n) inversions. If the number of entities is fairly static, this is tolerable.
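A hypothetical sketch of that pattern (all names are illustrative: `insertion_sort_by_key` stands in for the proposed `resort`, and `Vec::swap_remove` is presumably what the comment calls swap_pop):

```rust
// Insertion sort by a key function; near-O(n) when the slice is nearly sorted.
fn insertion_sort_by_key<T, K: Ord, F: Fn(&T) -> K>(v: &mut [T], key: F) {
    for i in 1..v.len() {
        let mut j = i;
        while j > 0 && key(&v[j]) < key(&v[j - 1]) {
            v.swap(j, j - 1);
            j -= 1;
        }
    }
}

#[derive(Debug)]
struct Entity { id: u32, z: i32 }

fn main() {
    let mut render_order = vec![
        Entity { id: 0, z: 1 },
        Entity { id: 1, z: 3 },
        Entity { id: 2, z: 5 },
    ];
    // A frame passes: one entity moves, one is added, one is removed.
    render_order[0].z = 4;                      // moved slightly
    render_order.push(Entity { id: 3, z: 2 }); // new entity appended at the end
    render_order.swap_remove(2);                // O(1) removal; adds O(n) inversions at worst
    // The vec is still nearly sorted, so insertion sort finishes quickly.
    insertion_sort_by_key(&mut render_order, |e| e.z);
    println!("{:?}", render_order); // ordered by z: 2, 3, 4
}
```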

@Gankra (Contributor, Author) commented Aug 19, 2014

./kill_all_PRs.exe

Gankra closed this Aug 19, 2014