-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[stdlib] Modify sort to pivot on median of 3 #6382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please smoke test |
@swift-ci Please benchmark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neato!
} | ||
} | ||
|
||
/// Reorders `elements` and returns an index `p` such that every element in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: do you mean "returns an index pivot
"?
// * elements[i] < pivot, for i in range.lowerBound..<lo | ||
// * pivot <= elements[i] for i in hi..<range.upperBound | ||
Loop: while true { | ||
FindLo: repeat { | ||
while lo != hi { | ||
if !${cmp("elements[lo]", "pivot", p)} { break FindLo } | ||
elements.formIndex(after: &lo) | ||
} | ||
break Loop | ||
} while false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I guess it's not really something you changed, but why repeat { } while false
instead of just do { }
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this code predates introduction of do
-catch
—also, I didn't realize you could break from a do
block. I've switched that, thanks!
Build comment file:Optimized (O) Regression (5)
Improvement (5)
No Changes (137)
Regression (3)
Improvement (3)
No Changes (141)
|
faef8a9
to
a445cb9
Compare
@swift-ci Please smoke test |
@swift-ci Please benchmark |
1 similar comment
@swift-ci Please benchmark |
@natecook1000 Can you add a benchmark that shows the bad perf? (Just so we can make sure we don't fall back into this). |
Build comment file:Optimized (O) Regression (6)
Improvement (4)
No Changes (137)
Regression (4)
Improvement (3)
No Changes (140)
|
@gottesmm Will do! I don't think it wasn't intended to test sorting performance, but the StaticArray test includes a sort of a reverse ordered "array", so you can see the performance increase there: ~3x unoptimized and ~5.5x optimized. |
Build comment file:Optimized (O) Regression (6)
Improvement (7)
No Changes (138)
Regression (5)
Improvement (4)
No Changes (142)
|
@gottesmm Test added—definitely shows a difference! (master):swift-macosx-x86_64$ /bin/Benchmark_O SortSortedStrings --num-iters=50 --num-samples=10
#,TEST,SAMPLES,MIN(us),MAX(us),MEAN(us),SD(us),MEDIAN(us)
126,SortSortedStrings,10,4557,4926,4708,96,4716
(nc-sort-median):swift-macosx-x86_64$ /bin/Benchmark_O SortSortedStrings --num-iters=50 --num-samples=10
#,TEST,SAMPLES,MIN(us),MAX(us),MEAN(us),SD(us),MEDIAN(us)
126,SortSortedStrings,20,776,909,809,45,788 |
@swift-ci Please benchmark |
Build comment file:Optimized (O) |
This modifies the _partition function used during sorting to select the median of the first, middle, and last elements in the range to be partitioned. Before partitioning begins, those three elements are sorted, after which the middle element is selected as the pivot. This change improves performance for sorted or nearly-sorted data.
a9d31c9
to
890bb27
Compare
@swift-ci Please smoke test |
@swift-ci Please benchmark |
Build comment file:Optimized (O) |
@natecook1000 Thanks for the benchmark! |
@gottesmm No problem! Seems like the new benchmark is tripping up swift-ci, though. Should I merge that through a separate PR and try again, or do you think we're okay with the performance now? |
890bb27
to
7303125
Compare
@swift-ci Please benchmark |
Build comment file:Optimized (O) Regression (5)
Improvement (8)
No Changes (139)
Regression (4)
Improvement (7)
No Changes (141)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really glad you've done this, Nate! Looks like good work; I just have a few questions.
${"" if p else ", C.Iterator.Element : Comparable"} | ||
{ | ||
// Possible starting states for elements at a, b, and c: | ||
// 123, 132, 213, 231, 312, 321, 112, 121, 211, 122, 212, 221, 111 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't you going to show me how a, b, and c relate to this collection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some more comments here and above—pls let me know if that's what you were looking for.
_ elements: inout C, | ||
_ a: C.Index, _ b: C.Index, _ c: C.Index | ||
${", by areInIncreasingOrder: (C.Iterator.Element, C.Iterator.Element) -> Bool" if p else ""} | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this declared in an extension to MutableCollection & RandomAccessCollection?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we finally route everything through one sort implementation that takes a closure, rather than gyb'ing, without regressing performance? A weak optimizer that may have since been fixed is the only reason we're gyb'ing this code. @bob-wilson what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm following the rest of the functions in the file, here. It would be great to have all of these as extension methods and to get rid of the gyb—I can try those changes out after we get this merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I'm missing something, getting rid of gyb and using extension methods are totally orthogonal concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the history of this or why you had to gyb the code, but regardless, if the optimizer has improved to let you get rid of that without regressing performance, that would be great.
let pivot = elements[range.lowerBound] | ||
// Sort the first, middle, and last elements, then use the middle value | ||
// as the pivot for the partition. | ||
let half = elements.distance(from: lo, to: hi) / 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is a signed integers that is always positive we probably either want to convert it to unsigned or use a right-shift to divide by 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm—IndexDistance
is a SignedInteger
, which doesn't have bit shifts, and there's no easy unsigned conversion. Is this better than the above?
let half = numericCast(elements.distance(from: lo, to: hi)) / 2 as UInt
let mid = elements.index(lo, offsetBy: numericCast(half))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we think we need to support sorting of huge files on 32-bit platforms, it might be. This will be awkward until we get the new integer protocols, so maybe you should insert a FIXME to use bitshifts once the new protocols come online. By the way, though, let's put the explicit type coercion next to the numericCast, OK?
let half = numericCast(elements.distance(from: lo, to: hi)) as UInt / 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks!
elements.formIndex(after: &lo) | ||
while lo != hi { | ||
if !${cmp("elements[lo]", "pivot", p)} { break FindLo } | ||
elements.formIndex(after: &lo) | ||
} | ||
break Loop | ||
} while false | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good transformation, but do we even need the do { ... } scope? It looks like we could dump it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need labeled scopes that break FindLo
and break FindHi
can break out of. The refactored versions I can come up with end up duplicating the lo != hi
bounds checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I overlooked the inner breaks. No problem here, then.
@@ -190,7 +237,7 @@ func _introSortImpl<C>( | |||
depthLimit: depthLimit &- 1) | |||
_introSortImpl( | |||
&elements, | |||
subRange: (elements.index(after: partIdx))..<range.upperBound, | |||
subRange: partIdx..<range.upperBound, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this an unintentional change? If not, why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an intentional change. Because the previous implementation of _partition
slotted the pivot element into partIdx
as its final step, the elements were always arranged such that everything in range.lowerBound..<partIndex
was less than elements[partIdx]
and in elements.index(after: partIdx)..<range.upperBound
was greater than or equal to elements[partIdx]
.
The new version doesn't have that guarantee, only that everything in range.lowerBound..<partIdx
is less than everything in partIdx..<range.upperBound
, so the sort of the upper partition needs to include the partitioning index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM; thanks for explaining.
@@ -262,6 +310,7 @@ func _heapify<C>( | |||
${", by: areInIncreasingOrder" if p else ""}) | |||
} | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that these are particularly bad, but we try to avoid whitespace-only changes, for the record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted, thanks!
d00ffd5
to
7303125
Compare
I guess this is the only thing I feel should be fixed: #6382 (comment) |
@@ -86,7 +90,8 @@ func _sort3<C>( | |||
C : MutableCollection & RandomAccessCollection | |||
${"" if p else ", C.Iterator.Element : Comparable"} | |||
{ | |||
// Possible starting states for elements at a, b, and c: | |||
// Possible starting orderings for elements at `elements[a]`, `elements[b]`, | |||
// and `elements[c]`: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change just made the text longer without addressing my point, which I obviously failed to articulate well. What I don't see is where a, b, and c fall in the sequence below. For example, if you wrote:
// 123, 132, 213, 231, 312, 321, 112, 121, 211, 122, 212, 221, 111
// ^ ^ ^
// a b c
I would understand the scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I completely get it—I don't know why I made that list look so much like a collection! I've updated those comments; please let me know if those make more sense. I'm trying to make the flow of the different cases clear, since there's some subtlety due to keeping the sort stable.
f2d339a
to
b582683
Compare
@swift-ci Please smoke test |
This modifies the internal
_partition
function used during sorting to select a better candidate as the pivot element. Currently, the first element of the range is always selected, leading to poor performance in cases of sorted or mostly sorted data. Benchmarks from this script for the current stdlib sort:This revised partition function sorts the first, middle and last elements in the range, and then uses the new middle element as the pivot. Choosing the pivot in this way eliminates the poor performance (except for the QuickSort killer ¯\_(ツ)_/¯):