This is the first of two major pass pipeline design changes that I have been wanting to make to fix performance instability.
This PR fixes the pipeline restart mechanism by rearranging passes to respect the function pass pipeline.
The second PR will fix the handling of @semantic calls in the inliner: their inlining will be deferred until late in the pipeline and will happen in a predictable order, so that passes that optimize @semantic calls are guaranteed to see them.
The higher-level goal of both these PRs is to allow for incremental progress on improvements to the pass pipeline and inlining without being backed into a corner, where current benchmarks perform well only by chance. For this to happen, our design needs to strictly adhere to some principles:
Pass pipelines are organized according to clear rules and
principles. Any other dependencies between passes are explicitly
called out. The major phases should be:
Optimization pipeline for unoptimized (Onone) builds
Optimization pipeline required before serialization
Optimization pipeline that handles @semantic and @effects functions
(the most powerful optimizations need to go here)
Optimization pipeline that further reduces the inlined implementation
of (non-nested) @semantic and @effects functions
(this last phase has less information from analyses)
(ultimately this last phase will be split between OSSA/non-OSSA pipelines)
A single decision to inline can only expose more
optimization/analysis opportunities (assuming it doesn't block
further inlining because of code size)
Inlining cannot prevent an optimization pass from processing a given
region of code.
A @semantic call site will always be processed by the optimization
pass that operates on those semantics regardless of inlining
decisions.
This PR is only one small step toward these principles.
This PR currently exposes a bug where SimplifyCFG does not update the dominator tree. This will be hit by the unit test:
SILOptimizer/specialize_opaque_type_archetypes.swift
Module passes need to be in a separate pipeline, otherwise the
pipeline restart mechanism will be broken.
This makes GlobalOpt and serialization run earlier in the
pipeline. There's no explicit reason for them to be run later, in the
middle of a function pass pipeline.
Also, pipeline boundaries, like serialization and module passes, should
be explicit at the top-level function that creates the pass
pipelines.
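To illustrate why a module pass in the middle of a function pass pipeline matters for restarts, here is a minimal toy model in Python. It is not the SIL pass manager. It assumes, for illustration only, that a restart can replay just the contiguous run of function passes that the requesting pass belongs to. The pass list and the helper names (`function_pass_runs`, `restart_scope`) are hypothetical; only SimplifyCFG and GlobalOpt are named in this PR.

```python
from itertools import groupby

# Hypothetical pass descriptors: (name, kind), kind is "function" or "module".
# The ordering is illustrative, not the real Swift pipeline.
MIXED_PIPELINE = [
    ("SimplifyCFG", "function"),
    ("PerformanceInliner", "function"),
    ("GlobalOpt", "module"),  # module pass in the middle of function passes
    ("SILCombine", "function"),
    ("CSE", "function"),
]

# Same passes, with the module pass moved ahead of the function passes.
SEPARATED_PIPELINE = [
    ("GlobalOpt", "module"),
    ("SimplifyCFG", "function"),
    ("PerformanceInliner", "function"),
    ("SILCombine", "function"),
    ("CSE", "function"),
]


def function_pass_runs(pipeline):
    """Split a pipeline into maximal runs of consecutive function passes.

    In this toy model every module pass acts as a barrier between runs.
    """
    runs = []
    for kind, group in groupby(pipeline, key=lambda p: p[1]):
        if kind == "function":
            runs.append([name for name, _ in group])
    return runs


def restart_scope(pipeline, pass_name):
    """Return the function passes that would rerun if `pass_name` restarted."""
    for run in function_pass_runs(pipeline):
        if pass_name in run:
            return run
    raise ValueError(f"{pass_name} is not a function pass in this pipeline")
```

Under these assumptions, `restart_scope(MIXED_PIPELINE, "SILCombine")` covers only the passes after GlobalOpt, while the same call on `SEPARATED_PIPELINE` covers all four function passes.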
Don't allow module passes to be inserted within a function pass
pipeline. This silently breaks the function pipeline, interfering
with both analysis and the normal pipeline restart mechanism.
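The rule above can be sketched as a pipeline builder that refuses a pass of the wrong kind and a top-level function that spells out every pipeline boundary. This is a hedged Python sketch, not the compiler's C++ API: `Pipeline`, `PipelineError`, `build_plan`, the pipeline names, and the "Serialize" pass name are all hypothetical.

```python
class PipelineError(Exception):
    """Raised when a pass of the wrong kind is added to a pipeline."""


class Pipeline:
    """A named pipeline that holds only function passes or only module passes."""

    def __init__(self, name, kind):
        if kind not in ("function", "module"):
            raise ValueError(f"unknown pipeline kind: {kind}")
        self.name = name
        self.kind = kind
        self.passes = []

    def add(self, pass_name, pass_kind):
        # Fail loudly instead of silently splitting the function pipeline.
        if pass_kind != self.kind:
            raise PipelineError(
                f"cannot add {pass_kind} pass {pass_name!r} "
                f"to {self.kind} pipeline {self.name!r}"
            )
        self.passes.append(pass_name)
        return self


def build_plan():
    """Top-level function: every pipeline boundary is visible here."""
    return [
        Pipeline("early module passes", "module").add("GlobalOpt", "module"),
        Pipeline("serialization", "module").add("Serialize", "module"),
        Pipeline("function optimizations", "function").add("SimplifyCFG", "function"),
    ]
```

With this shape, `Pipeline("function optimizations", "function").add("GlobalOpt", "module")` raises `PipelineError` at construction time, so the mistake cannot go unnoticed.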
How to read the data
The tables contain differences in performance which are larger than 8% and
differences in code size which are larger than 1%.
If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.
Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).
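As a rough illustration of the reporting thresholds described above, the following Python sketch checks whether a change would appear in the tables. It assumes the difference is measured relative to the old value; the benchmark script's actual formula is not shown here, and the names `relative_change` and `is_reported` are hypothetical.

```python
PERF_THRESHOLD = 0.08  # performance differences larger than 8% are listed
SIZE_THRESHOLD = 0.01  # code size differences larger than 1% are listed


def relative_change(old, new):
    """Change from old to new as a fraction of old (assumed definition)."""
    return (new - old) / old


def is_reported(old, new, threshold):
    """True if the magnitude of the relative change exceeds the threshold."""
    return abs(relative_change(old, new)) > threshold
```

For example, a runtime going from 100 to 109 would be listed under the performance threshold, while 100 to 105 would not.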
Hardware Overview
Model Name: Mac Pro
Model Identifier: MacPro6,1
Processor Name: 12-Core Intel Xeon E5
Processor Speed: 2.7 GHz
Number of Processors: 1
Total Number of Cores: 12
L2 Cache (per Core): 256 KB
L3 Cache: 30 MB
Memory: 64 GB