Add "sortedPrefix(_:by)" to Collection #9
Merged
31 commits
5429d3b Add partial sort algorithm (rakaramos)
4362197 Add in place partial sorting (rockbruno)
f299df1 Guide docs (rockbruno)
6cd2870 Use Indexes (rockbruno)
63b2dd0 Merge pull request #1 from rakaramos/guide (rakaramos)
88216e1 Add partial sort tests (rakaramos)
afe7111 Indent up to 80 columns (rakaramos)
4652ae7 Fix heapify stopping before it should (rockbruno)
37d494a Update PartialSort.md (rockbruno)
83d5f1e Update PartialSort.md (rockbruno)
bf31ba1 Update PartialSort.swift (rockbruno)
acb3583 Cleaning up iterators logic (rockbruno)
6227bd8 Update PartialSort.swift (rockbruno)
d4a2e6b Cleaning docs (rockbruno)
62ee6f2 Change implementation and name (rakaramos)
f674851 DocDocs (rockbruno)
5bdea96 Merge remote-tracking branch 'origin/fix-algo' into docdocs (rockbruno)
dd15b5a Docs (rockbruno)
7ac3915 Merge pull request #3 from rakaramos/docdocs (rockbruno)
c68537f Docs (rockbruno)
e8504fd Optimize (rockbruno)
36e9a39 Fix header and remove assert (rakaramos)
1d22ef9 Add more tests (#4) (rakaramos)
62096e1 Update PartialSortTests.swift (rockbruno)
d0c1ccd Merge pull request #5 from rakaramos/rockbruno-patch-1 (rockbruno)
23bf863 Update Sources/Algorithms/PartialSort.swift (rockbruno)
379609b Update Sources/Algorithms/PartialSort.swift (rockbruno)
435a38c Update Sources/Algorithms/PartialSort.swift (rockbruno)
70973a2 Documentation fixes (rockbruno)
70a263c Add tests for massive inputs (rockbruno)
1d3dcaf isLastElement (rockbruno)
@@ -0,0 +1,48 @@
# Sorted Prefix

[[Source](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/PartialSort.swift) |
 [Tests](https://github.com/apple/swift-algorithms/blob/main/Tests/SwiftAlgorithmsTests/PartialSortTests.swift)]

Returns the first k elements of this collection when it's sorted.

If you need to sort a collection but only need access to a prefix of its elements, using this method can give you a performance boost over sorting the entire collection. The order of equal elements is guaranteed to be preserved.

```swift
let numbers = [7,1,6,2,8,3,9]
let smallestThree = numbers.sortedPrefix(3, by: <)
// [1, 2, 3]
```

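For comparison, the full-sort baseline that this method is meant to improve on can be sketched with the standard library alone. This is only an illustration; the constant name `baseline` is chosen here and is not part of the proposal.

```swift
let numbers = [7, 1, 6, 2, 8, 3, 9]

// Baseline: sort all n elements, then keep only the first three.
let baseline = Array(numbers.sorted(by: <).prefix(3))
print(baseline) // [1, 2, 3]
```

Both approaches return the same elements for this input; the difference is that the baseline pays for sorting elements that are then discarded.
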
## Detailed Design

This adds the `Collection` method shown below:

```swift
extension Collection {
    public func sortedPrefix(_ count: Int, by areInIncreasingOrder: (Element, Element) throws -> Bool) rethrows -> [Element]
}
```

A version of this method for collections with `Comparable` elements is also provided:

```swift
extension Collection where Element: Comparable {
    public func sortedPrefix(_ count: Int) -> [Element]
}
```

### Complexity

The algorithm used is based on [Soroush Khanlou's research on this matter](https://khanlou.com/2018/12/analyzing-complexity/). The total complexity is `O(k log k + nk)`, which results in a runtime close to `O(n)` when k is small. When k is large (10% of the collection or more), we fall back to sorting the entire collection. Realistically, this means the worst case is actually `O(n log n)`.

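As a rough illustration of these bounds, the two cost expressions can be evaluated for sample sizes. The values of n and k below are chosen only for illustration, constant factors are ignored, and the results are operation-count estimates rather than measurements. For n = 10^6 and k = 10:

$$
\begin{aligned}
k \log_2 k + nk &\approx 10 \cdot 3.32 + 10^6 \cdot 10 \approx 1.0 \times 10^7 \\
n \log_2 n &\approx 10^6 \cdot 19.93 \approx 2.0 \times 10^7
\end{aligned}
$$

For the same n with k = 10^5 (10% of the collection), the `nk` term dominates:

$$
k \log_2 k + nk \approx 1.7 \times 10^6 + 10^{11} \approx 1.0 \times 10^{11} \gg n \log_2 n \approx 2.0 \times 10^7
$$

The `nk` term is an upper bound, since elements that do not belong in the prefix are skipped after a single comparison, but the second case suggests why a full sort is the safer choice once k grows.
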
Here are some benchmarks we made that demonstrate how this implementation (SmallestM) behaves as k increases (before implementing the fallback):




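### Comparison with other languages

**C++:** The `<algorithm>` library defines a `partial_sort` function that rearranges a range in place so that its first k elements are the smallest ones in sorted order, typically using a partial heap sort. The remaining elements are left in an unspecified order.

**Python:** The `heapq` module provides a heap-based priority queue that can be used to achieve the same result, for example through `heapq.nsmallest`.
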
@@ -0,0 +1,99 @@
//===----------------------------------------------------------------------===//
//
// This source file is part of the Swift Algorithms open source project
//
// Copyright (c) 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
//
//===----------------------------------------------------------------------===//

extension Collection {
  /// Returns the first k elements of this collection when it's sorted using
  /// the given predicate as the comparison between elements.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3, by: <)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The k number of elements to prefix.
  /// - Parameter areInIncreasingOrder: A predicate that returns true if its
  /// first argument should be ordered before its second argument;
  /// otherwise, false.
  ///
  /// - Complexity: O(k log k + nk)
  public func sortedPrefix(
    _ count: Int,
    by areInIncreasingOrder: (Element, Element) throws -> Bool
  ) rethrows -> [Self.Element] {
    assert(count >= 0, """
      Cannot prefix with a negative amount of elements!
      """
    )

    // Do nothing if we're prefixing nothing.
    guard count > 0 else {
      return []
    }

    // Make sure we are within bounds.
    let prefixCount = Swift.min(count, self.count)

    // If we're attempting to prefix more than 10% of the collection, it's
    // faster to sort everything.
    guard prefixCount < (self.count / 10) else {
      return Array(try sorted(by: areInIncreasingOrder).prefix(prefixCount))
    }

    var result = try self.prefix(prefixCount).sorted(by: areInIncreasingOrder)
    for e in self.dropFirst(prefixCount) {
      if let last = result.last, try areInIncreasingOrder(last, e) {
        continue
      }
      let insertionIndex =
        try result.partitioningIndex { try areInIncreasingOrder(e, $0) }
      let isLastElement = insertionIndex == result.endIndex
      result.removeLast()
      if isLastElement {
        result.append(e)
      } else {
        result.insert(e, at: insertionIndex)
      }
    }

    return result
  }
}

extension Collection where Element: Comparable {
  /// Returns the first k elements of this collection when it's sorted in
  /// ascending order.
  ///
  /// This example partially sorts an array of integers to retrieve its three
  /// smallest values:
  ///
  ///     let numbers = [7,1,6,2,8,3,9]
  ///     let smallestThree = numbers.sortedPrefix(3)
  ///     // [1, 2, 3]
  ///
  /// If you need to sort a collection but only need access to a prefix of its
  /// elements, using this method can give you a performance boost over sorting
  /// the entire collection. The order of equal elements is guaranteed to be
  /// preserved.
  ///
  /// - Parameter count: The k number of elements to prefix.
  ///
  /// - Complexity: O(k log k + nk)
  public func sortedPrefix(_ count: Int) -> [Element] {
    return sortedPrefix(count, by: <)
  }
}
There's still a logic issue here: if `e` is equal to `result.last`, execution will pass by this `continue` statement. That's a problem, because the call to `partitioningIndex` then returns `endIndex`, which becomes invalid after the call to `result.removeLast()`. What you want to ensure is that `e` is strictly less than `result.last` before proceeding.

Test case:
Crap... That's another case that we did have covered in the tests, but the prefix wasn't low enough to trigger the algorithm. I added more high input cases, hopefully it will work now.
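
The strict check described in the review comment could look roughly like the sketch below. It is not the code that was merged. The names `smallest(_:of:by:)` and `firstIndex(in:where:)` are hypothetical; the latter is a plain binary search standing in for `partitioningIndex` so that the snippet runs without the package. It works on arrays only and leaves out the 10% fallback.

```swift
// Hypothetical stand-in for `partitioningIndex`: binary search for the first
// index whose element satisfies `predicate`, assuming the array is partitioned
// so that all elements failing the predicate come first.
func firstIndex<T>(in sorted: [T], where predicate: (T) -> Bool) -> Int {
  var low = 0
  var high = sorted.count
  while low < high {
    let mid = low + (high - low) / 2
    if predicate(sorted[mid]) {
      high = mid
    } else {
      low = mid + 1
    }
  }
  return low
}

// Hypothetical sketch of the insertion loop with a strict "less than" check.
func smallest<T>(
  _ k: Int,
  of elements: [T],
  by areInIncreasingOrder: (T, T) -> Bool
) -> [T] {
  guard k > 0 else { return [] }
  let prefixCount = min(k, elements.count)
  var result = elements.prefix(prefixCount).sorted(by: areInIncreasingOrder)
  for e in elements.dropFirst(prefixCount) {
    // Proceed only when `e` is strictly less than the largest kept element.
    guard let last = result.last, areInIncreasingOrder(e, last) else {
      continue
    }
    // Because `e < last`, at least `last` satisfies the predicate, so the
    // index found here is always before the end of `result`.
    let insertionIndex = firstIndex(in: result) { areInIncreasingOrder(e, $0) }
    result.removeLast()
    result.insert(e, at: insertionIndex)
  }
  return result
}

print(smallest(3, of: [7, 1, 6, 2, 8, 3, 9], by: <)) // [1, 2, 3]

// An element equal to the current last one is skipped, so the earlier of two
// equal elements is the one that is kept.
let tagged = [(1, "a"), (1, "b")]
let kept = smallest(1, of: tagged) { $0.0 < $1.0 }
print(kept.map { $0.1 }) // ["a"]
```

With this guard the insertion index never equals `endIndex`, so no special case is needed after `removeLast()`.
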