-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Code Size Reduction: Pass large loadable types by address instead of by value #8909
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
nice work! Some general comments: |
Thanks @eeckstein I just started a new (clean) full test and smoke test. The previous one failed to build, either due to a new change in master or the incremental build process - will investigate tomorrow... |
I wonder if long-term we want to do something about these "IRGen SIL passes". Perhaps if SIL type lowering could calculate the sizes of things, they wouldn't need IRGen to be around and we could simplify things by keeping such indirect types as address-only all the way through the SIL pipeline. It doesn't seem like a big stretch to refactor IRGen type lowering to not depend on most of the rest of IRGen, and move it into a separate library that SIL can use. But you're also working on severely curtailing SILs use of address only types, so you'll probably have better ideas. |
@slavapestov This would require introducing the notion of target platform into SIL. The decision whether to be kept address only is currently tied to whether we will pass this type indirectly as per SwiftCC ABI. This decision is currently intimately tied to how the IRGen explosion schema (llvm types) is lowered to the SwiftCC (https://github.com/apple/swift-clang/blob/stable/include/clang/CodeGen/SwiftCallingConv.h, https://github.com/apple/swift/blob/master/lib/IRGen/GenCall.cpp#L299). You could model this earlier but that would require platform knowledge of type sizes before IRGen lowering which has been mostly avoided as a design goal (beyond the std library knowing the size of Int ...) [my interpretation, John or Joe please correct me if this interpretation is wrong] There are valid reason why SIL optimization passes might want to convert by-address values into SSA values (SROA, value propagation, all the things that an SSA optimizer likes to do). One could say that the optimizer should do those only on demand but the analysis of whether this is profitable is better expressed on SSA ... (the same reason we are moving to address only types as SSA until the end of the pipeline) |
@@ -54,7 +54,7 @@ const uint16_t VERSION_MAJOR = 0; | |||
/// in source control, you should also update the comment to briefly | |||
/// describe what change you made. The content of this comment isn't important; | |||
/// it just ensures a conflict if two people change the module format. | |||
const uint16_t VERSION_MINOR = 335; // Last change: no type in objc method table | |||
const uint16_t VERSION_MINOR = 336; // Last change: no type in objc method table |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick flyby - also change the comment
@slavapestov It makes sense to me to have a transitional phase where we prepare SIL for lowering to IR, since the alternatives are either to expose more low-level concerns in SIL or to make IRGen more complex. Most of the optimizer team's work this year has been toward making normal SIL higher-level, if anything. |
@gottesmm and I have talked idly about adding a "lowered" SIL phase in addition to "raw" and "canonical". |
ebd9688
to
521087f
Compare
2529296
to
9e8cf66
Compare
535c503
to
fe0fd57
Compare
@slavapestov @jrose-apple The "lowered" SIL stage was the plan of record last year. It's in tree as of my commit in January. |
Incidentally, a "lowered" SIL stage is much more problematic than raw/canonical stages because the representation of address-only is incompatible in both directions (in addition to only one side having platform knowledge). |
5139657
to
4b1e7b6
Compare
4b1e7b6
to
d7a7530
Compare
@swift-ci Please smoke test |
@swift-ci Please test |
@swift-ci Please Test Source Compatibility |
1 similar comment
@swift-ci Please Test Source Compatibility |
Merged newer PR - closing this one |
radar rdar://problem/28680453
Reduce the size of the text area (code size of the compiled output) by changing the way we generate LLVM IR for loadable types:
We do a SIL level transformation, during IRGen, that, from a bird's eye view, does the following:
For every every function that takes as an input a big type, re-write it to take a pointer to a big type instead.
: $@convention(method) (BigStruct)
turns into$@convention(method) (@in_constant BigStruct)
There are a few new SIL pieces to support this:
@in_constant
calling conventionretain_value_addr
andrelease_value_addr
SIL instructions that take as an input an address, load the value inside it and callretain_value
andrelease_value
respectivelyPlease note:
copy_addr
instruction) - reusing the value-witness function instead is less than ideal: it (needlessly) takes as a 3rd parameter %swift.type + takes all the parameters as i8 - requiring a bitcast at the caller (v.s. at the callee in my outlined function)alloc_stack
instructions and, in some cases,load
of the function arguments to make this work. The current implementation, while it has some peephole optimizations for the most painful cases, is still very conservative: we can update this in the future, further reducing the code size$BigType
, and now takes$*BigType
, there are cases wherein we create newalloc_stack
andstore
instructions in order to call said function. Just like the point made above - the current implementation is very conservative / can be further optimized. this will be done in phasesThe radar mentioned in this PR contains a test program called BigTypeProg: This program contains a big struct with some loadable types + a class / need for retain and release. Passes it around to different functions that access the struct.
Under -Onone we have 3X binary size reduction, under -O a 2X binary size reduction. The text area is reduced by around 80% and 60% respectively.
Last but not least: this PR is still a work-in-progress. There aren't nearly enough test-cases either for this huge change nor for the newly added SIL instructions. There are also some minor code cleanups / refactoring that will be done to the new SIL-level transformation. The PR will be updated in the next couple of days to remedy that. I decided to open this PR now to allow people enough time to review it. It is in an almost-final condition and passes all the smoke tests on my machine. @aschwaighofer , @rjmccall , @jckarter : Can you please start taking a look?