[semantic-arc] Optimize more lifetime joining when a copy_value, destroy_value are in the same block. #34844
Conversation
@swift-ci smoke test
This is going to let me remove the infamous *AndFold functions from TypeLowering.
Force-pushed from 7e47c9f to 2599a49.
@swift-ci test
@swift-ci benchmark
@swift-ci test source compatibility
Benchmark report (tables elided): performance and code size at -O, -Osize, and -Onone, plus code size for -swiftlibs. How to read the data: the tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them. Noise: sometimes the performance results (not code size!) contain false positives. Hardware overview elided.
Build failed
@swift-ci clean test linux platform
Took a look at the source compat failure. Found something interesting. The error has to do with the following situation:
The current patch would produce this:
This creates a problem since at %123 the lifetime of %8 is over, but %15 still depends on it. So what I need to do is: if the consuming use (in this case br bb15) is in the same block, walk the block and replace all of the references with a new borrow. I imagine at -Onone we don't want to do more than just walk the block.
…pecific opts and use that to add new, split-up semantic-arc tests. This split ensures we are testing only what a specific optimization does rather than everything together. NOTE: To split up the tests, I first grouped by hand the tests that I thought were specifically testing one pass. Then, for any tests whose FileCheck checks started to fail because we were no longer running the other passes, I put a copy back in the original semantic-arc-opts.sil to ensure we do not regress. The tests that I copied for each of these passes originally come from semantic-arc-opts.sil; I left them there as well so we can still see all of the opts together, and I only copied the ones that were testing pass-specific functionality.
…se}s(Operand *). This originally trafficked in Instructions and for some reason the name was never changed. I also changed the result type to bool and added the ability for the passed-in closure to signal failure (and stop iteration) by returning false. This also makes it possible to use visitLocalEndScopeUses in if statements, which can be useful.
…roy_value are in the same block. Specifically the optimization that is being performed here is the elimination of lifetimes that hand off ownership between two regions of code. Example:

```
%1 = copy_value %0
...
destroy_value %0
...
apply %consumingUser(%1)
```

We really want to handle this case since it is a natural thing that comes up in programs, and it will let me eliminate the *evil* usage of emitDestroyValueOperation in TypeLowering without needing to modify huge amounts of tests.
Force-pushed from 2599a49 to 85965ed.
@swift-ci test
@swift-ci test source compatibility
All of the failures in the source compat suite are due to float16 issues, not this PR (unlike before).
@atrick I am just committing some small peepholes so I can eliminate the *AndFold stuff. I don't want to invest the time in copy prop right now.
Re testing. |
@swift-ci test
Build failed
@swift-ci smoke test OS X platform
Specifically, the optimization being performed here is the elimination of lifetimes that hand off ownership between two regions of code. Example:
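```
%1 = copy_value %0
...
destroy_value %0
...
apply %consumingUser(%1)
```

As a rough sketch of what joining the lifetimes produces (this "after" form is illustrative, not taken from the PR), the copy_value/destroy_value pair is deleted and the original value flows straight into the consuming use:

```
...                          // %0 simply stays alive through this region
apply %consumingUser(%0)
```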
We really want to handle this case since it is a natural thing that comes up in programs. That being said, when I originally implemented this optimization, I started with the assumption that the copy_value, destroy_value had to be in different blocks and, for simplification, that:

- The operand of the copy_value had only one destroy: the destroy_value that we are trying to eliminate.
- The copy_value only had one consuming user, the consuming user that we are tracking.
This isn't much to work with unless one wants to do a heavier-weight analysis. So, since this was meant to be a cheap optimization, I realized quickly that beyond checking whether the consuming user is a terminator (or is in a terminator block), I couldn't really do much more than optimize based on the assumption that the copy_value, destroy_value, and consumingUser were in the same block.
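As a sketch of the shape that check accepts, in the same abbreviated notation as the example above (the block names here are illustrative):

```
bb1:
  %1 = copy_value %0
  ...
  destroy_value %0           // assumed to be the only destroy of %0
  ...
  br bb2(%1)                 // the consuming user of %1 is the block's terminator
```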
With that in mind, I am now trying to eliminate the emit*AndFold operations from type lowering. It is surprising that APIs like type lowering/SILBuilder can delete instructions, and it is even worse that this can introduce memory unsafety in the compiler if the caller code needs to destroy instructions itself after updating state. Thus I needed to teach the slimmed-down semantic-arc-opts that we run at -Onone how to do that really simple optimization.
After some thought, I realized that all of the "emit*AndFold" operations only look earlier in the block for the obvious pattern, and that pattern is the same thing I was handling in the previous implementation once I had gotten to the point where I could only do anything interesting with the copy_value, destroy_value in the same block.
So I split the previous "is this optimization safe" check into two variants:

1. A variant that does the terminator check that I mentioned previously, and only does so if the copy_value has a single consuming use and the destroy_value is its operand's only consuming use.
2. A variant that allows our copy_value's operand to have multiple destroy_values, but requires that the destroy_value and the copy_value we want to eliminate are in the same block, and that the copy_value has a single consuming use that is also in the same block as the copy_value, destroy_value (see the sketch below).
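A sketch of the shape that variant (2) accepts (block and value names are illustrative, and %0 is assumed to have additional destroy_values on other control-flow paths):

```
bb1:
  %1 = copy_value %0
  ...
  destroy_value %0           // the destroy we pair with the copy; other
                             // destroy_values of %0 live on other paths
  ...
  apply %consumingUser(%1)   // %1's single consuming use, in the same block
  ...
```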
Then I changed the optimization to first check if a destroy_value is the only
consuming use of the copy's operand. In that case, we perform (1) and optimize
if it is safe.
If we didn't succeed while performing that optimization, then we visit all destroy_value uses of the operand of the copy_value. If any of those destroy_values are in the same block as the copy_value we want to optimize, then:
a. If the copy_value had multiple consuming insts, then we know that those consuming insts cannot be in the same block as the copy_value, since we would have a double lifetime-ending use; they must therefore all come after the destroy_value. This means that we can immediately eliminate the copy_value, destroy_value without further work.
b. If the copy_value had only a single consuming inst and that consuming inst is not in the same block as the copy_value, then we know it is after the destroy_value, so we can optimize without further work (sketched below).
c. Otherwise, we know that our copy_value has a single consuming inst and that
it is in the same block as the copy_value, destroy_value. In that case, we
perform the safety check above from (2).
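A sketch of case (b), where the single consuming inst lives in a later block (block and value names are illustrative, not SIL from the PR):

```
bb1:
  %1 = copy_value %0
  ...
  destroy_value %0           // a destroy of %0 in the same block as the copy
  br bb2
bb2:
  apply %consumingUser(%1)   // %1's single consuming use is in a later block,
                             // so it necessarily comes after the destroy_value
```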
Thus we are able to keep doing the same optimizations as before, while also handling simple hand-off copies in a much stronger way.