Fix the mid-level pass pipeline #22445
Conversation
@swift-ci test.
Build failed
Build failed
This PR is on hold because there are a few benchmark regressions I don't have time to investigate today. The vast majority are improvements, but I should still understand why the few regressions happened before merging.
This PR is still active. I plan to work on merging it this week after finishing some work on EscapeAnalysis. Unfortunately, I can't avoid all regressions because of some fundamental problems in ARC optimization.
lgtm
@swift-ci test
@swift-ci benchmark
@swift-ci test source compatibility
@swift-ci test compiler performance
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibs
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them before merging.
Noise: Sometimes the performance results (not code size!) contain false alarms.
Hardware Overview
Build failed
Build failed
@swift-ci test compiler performance
1 similar comment
@swift-ci test compiler performance
Summary for master full
Unexpected test results, excluded stats for RxCocoa, SwifterSwift, Base64CoderSwiftUI
Regressions found (see below)

Debug-batch
debug-batch brief
Regressed (0)
Improved (0)
Unchanged (delta < 1.0% or delta < 100.0ms) (3)
debug-batch detailed
Regressed (0)
Improved (4)
Unchanged (delta < 1.0% or delta < 100.0ms) (171)

Release
release brief
Regressed (3)
Improved (0)
Unchanged (delta < 1.0% or delta < 100.0ms) (0)
release detailed
Regressed (6)
Improved (0)
Unchanged (delta < 1.0% or delta < 100.0ms) (13)
This way we catch the DeadFunctionArgs optimization too, which is a separate sub-pass.
And fix a test case that assumes Array.append is never inlined. We need to be able to prevent inlining a function and any of its specializations. There's no way to specify multiple function names on the command line, so we use the partial match technique.
Inlining semantic calls must *always* be deferred to the mid-level pipeline. Otherwise, array optimizations take effect randomly. This is a primary cause of instability in Swift benchmarks.
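For illustration (this snippet is not from the PR), here is the kind of user code whose performance depends on that ordering: the subscript and count calls below are backed by array semantic functions, and optimizations such as bounds-check hoisting only fire if those calls are still recognizable as semantic calls when the high-level array passes run.

```swift
// Hypothetical example, not from this PR. If the semantic calls behind
// values.count and values[i] are inlined before the mid-level pipeline,
// the array optimizations no longer see them and the bounds checks stay
// inside the loop.
func sumAll(_ values: [Int]) -> Int {
    var total = 0
    for i in 0..<values.count {
        // Each access here is a semantic call until the mid-level inliner runs.
        total += values[i]
    }
    return total
}
```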
Module passes need to be in a separate pipeline; otherwise the pipeline restart mechanism will be broken. This makes GlobalOpt and serialization run earlier in the pipeline. There's no explicit reason for them to run later, in the middle of a function pass pipeline. Also, pipeline boundaries, like serialization and module passes, should be explicit in the top-level function that creates the pass pipelines.
inlining. This tests that the pass pipeline is properly set up to run all the array optimizations even when inlining is required. This requires that the pass manager defer all semantic inlining until it reaches the mid-level pipeline.
So -debug-only=sil-inliner is useful.
This complements the pass pipeline changes. Now no semantic inlining occurs in the early phase, and we expect to see nested semantic calls in the mid-level pipeline. To ensure that optimizations based on call semantics work, we need those nested semantic calls to be inlined.
So they can be used in ADTs like SmallPtrSet.
So that it's possible to determine the kind of Array semantic function from its SILFunction decl only.
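As a rough source-level illustration of what that means (the attribute spellings below are from memory and may not match the stdlib exactly), the semantics string attached to the declaration is enough to classify the operation; no inspection of the body or the call site is needed.

```swift
// Illustrative only: the @_semantics strings here may not be the exact
// stdlib spellings. The point is that the kind of array operation is
// encoded on the declaration itself, so the optimizer can classify a
// SILFunction from its decl alone.
struct MyArray<Element> {
    var storage: [Element] = []

    @_semantics("array.get_count")
    func count() -> Int { storage.count }

    @_semantics("array.check_subscript")
    func checkSubscript(_ index: Int) {
        precondition(index >= 0 && index < storage.count)
    }
}
```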
Don't allow module passes to be inserted within a function pass pipeline. This silently breaks the function pipeline, interfering both with analysis invalidation and with the normal pipeline restart mechanism.
Devirtualizing try_apply modified the CFG, but passes that run devirtualization were not invalidating any CFG analyses, such as the domtree. This could hypothetically have caused miscompilation, but will be caught by running with -sil-verify-all.
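A minimal sketch of the situation being described (not code from this PR): a throwing call lowers to a try_apply in SIL, and when the devirtualizer resolves it to a direct call it rewrites the call's successor edges, so dominator information computed before the rewrite becomes stale.

```swift
// Hypothetical example. The throwing call below is dispatched through a
// try_apply; because the concrete type is known at the call site, the
// devirtualizer can replace the dynamic call with a direct one, which
// rewrites the CFG and therefore invalidates analyses like the domtree.
enum ParseError: Error { case notANumber }

class Parser {
    func parse(_ text: String) throws -> Int {
        guard let value = Int(text) else { throw ParseError.notANumber }
        return value
    }
}

func run(_ s: String) -> Int? {
    let p = Parser()        // concrete type is known here
    return try? p.parse(s)  // candidate try_apply for devirtualization
}
```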
Run all high-level opts before inlining any semantic calls. We have one exception: an array.uninitialized compiler intrinsic that needs to be expanded early so optimizations can see the "normal" underlying semantic call. This not only ensures that the stdlib is well optimized, but also ensures that callers are fully optimized and inlined before attempting to run semantic passes. Lower nested semantic calls incrementally so that semantic passes rerun at each level. Avoid fully expanding any semantic functions until all semantic passes have seen them and all callers have been fully optimized. Inlining the lowest-level semantic calls defeats basic EscapeAnalysis and SideEffectAnalysis; this shouldn't be done until the late inliner runs.
Interposing non-semantic calls within semantic calls breaks the mechanism by which the pass pipeline incrementally processes each layer of semantics. Semantics should go "all the way down" to the level that we want inlined in the mid-level pipeline.
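A hypothetical sketch of the "all the way down" rule (the function names and semantics strings below are made up for illustration): each semantic function calls the next semantic layer directly, so the mid-level pipeline can inline and re-analyze one layer per pass-manager round. A plain, non-semantic helper interposed between the two layers would hide the inner semantic call from that incremental process.

```swift
// Names and @_semantics strings are hypothetical, for illustration only.
// The higher-level semantic function calls the lower-level semantic
// function directly: semantics "all the way down".
@_semantics("myarray.reserve_capacity")
func reserveCapacity(_ buffer: inout [Int], _ n: Int) {
    buffer.reserveCapacity(n)
}

@_semantics("myarray.append")
func append(_ buffer: inout [Int], _ element: Int) {
    // Direct call into the next semantic layer; no plain helper in between,
    // so the pipeline can peel off one layer of semantics per round.
    reserveCapacity(&buffer, buffer.count + 1)
    buffer.append(element)
}
```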
Which broke as a result of stdlib refactoring, and which I don't have time to get working again.
Not all simplifications correctly invalidate CFG analyses.
Superseded by #31163 and other follow-up PRs.
This is a small series of commits that together fix issues in the pass pipeline that are causing severe instability in benchmark performance and blocking other bug fixes.
We have heroic optimizations for array access, but whether they actually kicked in was essentially random. Now those optimizations always have a chance to run, and the pass pipeline consistently reruns when necessary, as it was designed to do.
The two main bug fixes are deferring all semantic-call inlining to the mid-level pipeline, and keeping module passes out of the function pass pipeline so that the pipeline restart mechanism works as intended.