-
Notifications
You must be signed in to change notification settings - Fork 10.5k
SILOptimizer: try inplace_function for compile performance (Test) #42208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…y allocation overhead.
47abc69
to
bfe32b0
Compare
I've run some tests locally and seem to be getting roughly a 2-4% speedup in the full build time on the project where I noticed the compile time issues; e.g.
@eeckstein or @gottesmm (as the last people to modify these files): if you think it's worth running a full compiler performance test on CI then I'd appreciate you starting that; otherwise, given these are fairly modest speedups, I'm also okay with closing this PR. |
nice work! |
Thank you! Although this overhead is not surprising, it's unacceptable--we need a fix. I really am surprised that std::function doesn't have any inline storage optimization though! llvm::unique_function does. Can you just use that instead? I don't know if there's any possibility of including a header under a different software license. AFAIK we need to keep everything in the Swift repo under the same license. You'd need to check on the forum. Just to make sure I understand, is the 5% overhead mostly from std::function allocation within UpdatingInstructionIterator here:
For the record, the InstructionDeleter could be fixed to directly call iteratorRegistry.notifyDelete and notifyNew. But that's unsafe within any SIL passes that still rely on InstModCallbacks. To do that safely, we would need to eliminate InstModCallbacks complete, which still requires fixing a handful of SIL passes. |
I’ve looked into this a little more and now think that the timing improvements from before may have been benchmark noise. The ~5% impact was measured on the 5.6 branch and was mainly in In terms of the timing from before: I ran the build a few more times with everything else closed on my machine, and the results were: Without these changes:
With these changes:
That seems close enough to be in the margin of error, although I do find it interesting that the user time and CPU usage were consistently slightly lower than without the changes in this PR. Either that's just benchmark noise or is a tiny real improvement caused by some other factors (alignment changes?), but either way I don't feel it's enough to justify this PR. To quickly address one other point:
My understanding (from https://stackoverflow.com/a/46163732 and its comments) is Anyway, thank you for having fixed the underlying issue back in #41239 and I hope looking at this didn't take up too much of your time. |
Thanks for doing these experiments! |
@eeckstein fixed a severe problem where these closures were allocating unnecessarily sometime in the 5.7 time frame. It should be in that file's git log. I suspect that's what you were running into. Thanks for checking! |
I was profiling some slow compiles and saw that the std::functions in UpdatingListIterator seems to be incurring a lot of new/delete overhead (around 5% of the compile time for a -O -wmo build), so was curious as to how much moving the out-of-line std::function allocations inline would affect performance.
I haven't been able to test performance on code locally due to difficulties in building the SwiftPM part of a toolchain (although the Swift tests pass), and I don't expect this to be merged as-is (since I’m guessing the inplace_function dependency might be an issue). It'd be great if someone could trigger a CI compiler performance test to see if this is worth pursuing; otherwise I can try to fix the toolchain build errors and get some local numbers sometime later this week.
inplace_function.h was taken from https://github.com/WG21-SG14/SG14/blob/master/SG14/inplace_function.h and hackily modified to assert on exceptions.