-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[WIP][stdlib] Minor Sequence Optimizations #20758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Rotate the ringBuffer in-place instead of copying to new array for lower memory footprint.
Nano-optimization: Following the template used in methods `map` and `_filter`, internally use ContiguousArray before converting to Array on return.
When measuring locally, I saw no performance difference. But I don't think our benchmarks, slicing only short sequences of @airspeedswift Can this be accepted only on theoretical memory-saving grounds? Or do we need to devise better benchmark that showcases the tradeoffs between in-place rotation and array copy? |
@swift-ci please benchmark |
@swift-ci please smoke test |
@airspeedswift Please review 🙏 |
This comment has been minimized.
This comment has been minimized.
Hmm… I don't recall seeing any regressions with SuffixSequence when I ran this locally 2 weeks ago. 🤨 |
@airspeedswift Any idea why that ^^ might be? |
Let's try again… |
Build comment file:Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
I cannot reproduce the regression on |
@swift-ci please benchmark |
This comment has been minimized.
This comment has been minimized.
Now just the in-place Array rotation... |
This comment has been minimized.
This comment has been minimized.
OK, so the in-place rotation is causing the de-optimization in |
6d943cc
to
46b462f
Compare
46b462f
to
fa15cb2
Compare
Build comment file:Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
How to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
@airspeedswift I think I have this down now… can you please have a look, because in order to work around the weird de-optimization in |
@usableFromInline | ||
mutating func _rotate(left distance: Int) { | ||
guard distance > 0 else { return } | ||
let i = index(startIndex, offsetBy: distance) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not necessarily the best way to rotate, and is probably not the API we want to commit to supporting even internally. Please try to use what's already written in the algorithms prototype, and especially read this comment and test accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's better than to split this and do just the consistent use of ContiguousArray
in this PR, and tackle the in-place rotation in a separate PR, right?
Following up on discussion from #20221 (review), these are minor tweaks to
Sequence
implementation:ringBuffer
used insuffix
, avoiding new array copy.ContiguousArray
indropLast
,prefix
andsuffix
(modeled aftermap
and_filter
implementations).