|
1 |
| -# Partial Sort |
| 1 | +# Partial Sort (sortedPrefix) |
2 | 2 |
|
3 | 3 | [[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/PartialSort.swift) |
|
4 | 4 | [Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/PartialSortTests.swift)]
|
5 | 5 |
|
6 |
| -Returns a collection such that the `0...k` range contains the first k sorted elements of a sequence. |
7 |
| -The order of equal elements is not guaranteed to be preserved, and the order of the remaining elements is unspecified. |
| 6 | +Returns the first k elements of this collection when it's sorted. |
8 | 7 |
|
9 |
| -If you need to sort a sequence but only need access to a prefix of its elements, |
10 |
| -using this method can give you a performance boost over sorting the entire sequence. |
| 8 | +If you need to sort a collection but only need access to a prefix of its |
| 9 | +elements, using this method can give you a performance boost over sorting |
| 10 | +the entire collection. The order of equal elements is guaranteed to be |
| 11 | +preserved. |
11 | 12 |
|
12 | 13 | ```swift
|
13 | 14 | let numbers = [7,1,6,2,8,3,9]
|
14 |
| -let almostSorted = numbers.partiallySorted(3, <) |
15 |
| -// [1, 2, 3, 9, 7, 6, 8] |
| 15 | +let smallestThree = numbers.sortedPrefix(3, by: <) |
| 16 | +// [1, 2, 3] |
16 | 17 | ```
|
17 | 18 |
|
18 | 19 | ## Detailed Design
|
19 | 20 |
|
20 |
| -This adds the in-place `MutableCollection` method shown below: |
| 21 | +This adds the `Collection` method shown below: |
21 | 22 |
|
22 | 23 | ```swift
|
23 |
| -extension Sequence { |
24 |
| - func partiallySort(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows |
| 24 | +extension Collection { |
| 25 | + public func sortedPrefix(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> [Element] |
25 | 26 | }
|
26 | 27 | ```
|
27 | 28 |
|
28 |
| -Additionally, versions of this method that return a new array and abstractions for `Comparable` types are also provided: |
| 29 | +Additionally, a version of this method for `Comparable` types is also provided: |
29 | 30 |
|
30 | 31 | ```swift
|
31 |
| -extension MutableCollection where Self: RandomAccessCollection, Element: Comparable { |
32 |
| - public mutating func partiallySort(_ count: Int) |
33 |
| -} |
34 |
| - |
35 |
| -extension Sequence { |
36 |
| - public func partiallySorted(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> [Element] |
37 |
| -} |
38 |
| - |
39 |
| -extension Sequence where Element: Comparable { |
40 |
| - public func partiallySorted(_ count: Int) -> [Element] |
| 32 | +extension Collection where Element: Comparable { |
| 33 | + public func sortedPrefix(_ count: Int) -> [Element] |
41 | 34 | }
|
42 | 35 | ```
|
43 | 36 |
|
44 | 37 | ### Complexity
|
45 | 38 |
|
46 |
| -Partially sorting is a O(_k log n_) operation, where _k_ is the number of elements to sort |
47 |
| -and _n_ is the length of the sequence. |
| 39 | +The algorithm used is based on [Soroush Khanlou's research on this matter](https://khanlou.com/2018/12/analyzing-complexity/). The total complexity is `O(k log k + nk)`, which results in a runtime close to `O(n)` when k is small. When k is large (more than 10% of the collection), we fall back to sorting the entire array, so the worst case is realistically `O(n log n)`. |
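
The approach described above can be sketched roughly as follows. This is a simplified illustration, not the code in `PartialSort.swift`: the name `sketchSortedPrefix` is hypothetical, the closure is non-throwing, and the fallback to a full sort for large k is omitted.

```swift
extension Collection {
  /// Hypothetical helper for illustration only; not part of swift-algorithms.
  func sketchSortedPrefix(
    _ count: Int,
    by areInIncreasingOrder: (Element, Element) -> Bool
  ) -> [Element] {
    precondition(count >= 0, "count must be non-negative")
    guard count > 0 else { return [] }

    // Sort the first k elements: O(k log k).
    // Note: `sorted(by:)` is not documented as stable, so this sketch alone
    // does not guarantee the order of equal elements within the first k.
    var result = prefix(count).sorted(by: areInIncreasingOrder)

    // Scan the remaining elements once.
    for element in dropFirst(count) {
      // Skip anything that is not strictly smaller than the largest kept element.
      guard let last = result.last, areInIncreasingOrder(element, last) else {
        continue
      }

      // Binary search for the first kept element that is strictly greater
      // than `element`, so equal elements keep their original relative order.
      var low = 0
      var high = result.count
      while low < high {
        let mid = low + (high - low) / 2
        if areInIncreasingOrder(element, result[mid]) {
          high = mid
        } else {
          low = mid + 1
        }
      }

      // Drop the current largest element and insert the new one: O(k).
      result.removeLast()
      result.insert(element, at: low)
    }
    return result
  }
}

let sketchNumbers = [7, 1, 6, 2, 8, 3, 9]
print(sketchNumbers.sketchSortedPrefix(3, by: <))
// [1, 2, 3]
```

Each of the n elements costs at most one comparison against the current largest kept element, plus an `O(k)` insertion when it qualifies, which is where the `nk` term comes from.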
| 40 | + |
| 41 | +Here are some benchmarks we made that demonstrate how this implementation (SmallestM) behaves as k increases (before implementing the fallback): |
48 | 42 |
|
49 |
| -`partiallySort(_:by:)` is a slight generalization of a priority queue. It's implemented |
50 |
| -as an in-place heapsort that stops after _k_ runs. |
| 43 | + |
| 44 | + |
51 | 45 |
|
52 | 46 | ### Comparison with other languages
|
53 | 47 |
|
54 |
| -**C++:** The `<algorithm>` library defines a `partial_sort` function with similar |
55 |
| -semantics to this one. |
| 48 | +**C++:** The `<algorithm>` library defines a `partial_sort` function that rearranges the whole range in place, leaving the sorted elements at its front, rather than returning a new collection. |
56 | 49 |
|
57 | 50 | **Python:** Defines a `heapq` priority queue that can be used to manually
|
58 | 51 | achieve the same result.
|
|