[WIP] Fix the mid-level function-pass pipeline #31163

atrick · 2020-04-20T22:55:17Z

This is the first of two major pass pipeline design changes that I have been wanting to make to fix performance instability.

This PR fixes the pipeline restart mechanism by rearranging passes to respect the function pass pipeline.

The second PR will fix handling of @semantic calls in the inliner so their inlining is deferred until late in the pipeline and they are inlined in a predictable order so that passes that optimize @semantic calls are guaranteed to see them.

The higher-level goal of both these PRs is to allow for incremental progress on improvements to the pass pipeline and inlining without being backed into a corner, where current benchmarks perform well only by chance. For this to happen, our design needs to strictly adhere to some principles:

Pass pipelines are organized according to clear rules and
principles. Any other dependencies between passes are explicitly
called out. The major phases should be
- Optimization pipeline for unoptimized (Onone) builds
- Optimization pipeline required before serialization
- Optimization pipeline that handles @semantic and @effects functions
  (the most powerful optimizations need to go here)
- Optimization pipeline that further reduces the inlined implementation
  of (non-nested) @semantic and @effects functions
  (this last phase has less information from analyses)
  (ultimiately this last phase will be split between OSSA/non-OSSA pipelines)
A single decision to inline can only expose more
optimization/analyses opportunities (assuming it doesn't block
further inlinining because of code size)
Inlining cannot prevent an optimization pass from processing a given
region of code.
A @semantic call site will always be processed by the optimization
pass that operates on those semantics regardless of inlining
decisions.

This PR is only one small step toward these principles.

This PR currently exposes a bug where SimplifyCFG does not update the dominator tree. This will be hit be the unit test:
SILOptimizer/specialize_opaque_type_archetypes.swift

Module passes need to be in a separate pipeline, otherwise the pipeline restart mechanism will be broken. This makes GlobalOpt and serialization run earlier in the pipeline. There's no explicit reason for them to be run later, in the middle of a function pass pipeline. Also, pipeline boundaries, like serialization and module passes should be explicit at the the top level function that creates the pass pipelines.

Don't allow module passes to be inserted within a function pass pipeline. This silently breaks the function pipeline both interfering with analysis and the normal pipeline restart mechanism.

Not all simplifications correctly invalidate CFG analyses.

airspeedswift · 2020-04-28T21:31:18Z

@swift-ci please benchmark

swift-ci · 2020-04-28T22:34:36Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
ArrayPlusEqualSingleElementCollection	799	4747	+494.1%	0.17x
ObjectiveCBridgeStubFromNSDateRef	4810	5810	+20.8%	0.83x (?)
FlattenListLoop	4451	4967	+11.6%	0.90x (?)
ObserverForwarderStruct	910	1015	+11.5%	0.90x (?)
Prims.NonStrongRef.UnownedSafe.Closure	795	883	+11.1%	0.90x (?)
Prims.NonStrongRef.UnownedSafe	795	882	+10.9%	0.90x (?)
DictionaryOfAnyHashableStrings_insert	5054	5586	+10.5%	0.90x (?)
ObserverUnappliedMethod	1200	1300	+8.3%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
ArrayPlusEqualFiveElementCollection	8436	5106	-39.5%	1.65x
String.data.LargeUnicode	124	102	-17.7%	1.22x (?)
SubstringEquatable	866	750	-13.4%	1.15x
ArrayAppendLazyMap	7030	6090	-13.4%	1.15x (?)
LazilyFilteredArrayContains	41400	36300	-12.3%	1.14x (?)
RemoveWhereMoveInts	36	33	-8.3%	1.09x (?)
MapReduce	371	342	-7.8%	1.08x (?)
RemoveWhereSwapInts	64	59	-7.8%	1.08x (?)
ParseInt.UInt64.Decimal	253	235	-7.1%	1.08x (?)
MapReduceAnyCollection	397	369	-7.1%	1.08x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
Array2D.o	2984	3016	+1.1%	0.99x

Improvement	OLD	NEW	DELTA	RATIO
PrimsNonStrongRef.o	121112	119344	-1.5%	1.01x
LazyFilter.o	8384	8296	-1.0%	1.01x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
ArrayPlusEqualSingleElementCollection	846	4653	+450.0%	0.18x
ObjectiveCBridgeStubToNSDateRef	3500	4020	+14.9%	0.87x (?)
Combos	490	562	+14.7%	0.87x (?)
ObserverForwarderStruct	870	975	+12.1%	0.89x (?)
Prims.NonStrongRef.UnownedSafe.Closure	786	873	+11.1%	0.90x (?)
Prims.NonStrongRef.UnownedSafe	786	873	+11.1%	0.90x (?)
DictionaryOfAnyHashableStrings_insert	5110	5628	+10.1%	0.91x (?)
DataCountSmall	31	34	+9.7%	0.91x (?)
UTF8Decode_InitFromCustom_contiguous_ascii_as_ascii	401	437	+9.0%	0.92x (?)
ObserverUnappliedMethod	1220	1320	+8.2%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
SubstringEquatable	875	734	-16.1%	1.19x (?)
ObjectiveCBridgeFromNSSetAnyObjectToStringForced	96500	84500	-12.4%	1.14x (?)
ArrayAppendLazyMap	7900	6990	-11.5%	1.13x (?)
ObjectiveCBridgeStubToNSDate2	620	560	-9.7%	1.11x (?)
UTF8Decode_InitFromBytes_ascii	348	319	-8.3%	1.09x (?)
DictionaryBridgeToObjC_Access	851	781	-8.2%	1.09x (?)
RemoveWhereMoveInts	37	34	-8.1%	1.09x (?)
ArrayPlusEqualFiveElementCollection	8325	7696	-7.6%	1.08x (?)
RemoveWhereSwapInts	67	62	-7.5%	1.08x
FlattenListLoop	5193	4841	-6.8%	1.07x (?)
MapReduce	435	406	-6.7%	1.07x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
ArrayOfPOD.o	3286	3558	+8.3%	0.92x
RecursiveOwnedParameter.o	1471	1551	+5.4%	0.95x

Improvement	OLD	NEW	DELTA	RATIO
SequenceAlgos.o	23463	22103	-5.8%	1.06x
DropFirst.o	19631	18975	-3.3%	1.03x
DataBenchmarks.o	57542	56829	-1.2%	1.01x
NibbleSort.o	13000	12856	-1.1%	1.01x
LazyFilter.o	8011	7923	-1.1%	1.01x
ArrayAppend.o	23159	22911	-1.1%	1.01x

Performance: -Onone

Regression	OLD	NEW	DELTA	RATIO
DictionaryRemove	12300	13750	+11.8%	0.89x (?)
CharIndexing_japanese_unicodeScalars_Backwards	756440	821400	+8.6%	0.92x (?)
Dict.CopyKeyValue.24k	5365	5822	+8.5%	0.92x (?)
DictionarySwap	4352	4716	+8.4%	0.92x (?)
Dict.CopyKeyValue.20k	4614	4984	+8.0%	0.93x (?)
ExclusivityIndependent	91	98	+7.7%	0.93x (?)
Dict.CopyKeyValue.16k	3914	4209	+7.5%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
DictionarySubscriptDefaultMutation	12215	4388	-64.1%	2.78x
DictionarySubscriptDefaultMutationArray	12648	4721	-62.7%	2.68x
FrequenciesUsingReduceInto	9970	4080	-59.1%	2.44x
DictionarySubscriptDefaultMutationOfObjects	17220	8640	-49.8%	1.99x
DictionarySubscriptDefaultMutationArrayOfObjects	19680	10220	-48.1%	1.93x
FrequenciesUsingReduce	15620	9770	-37.5%	1.60x
Dictionary3OfObjects	1568	1463	-6.7%	1.07x (?)

Code size: -swiftlibs

Improvement	OLD	NEW	DELTA	RATIO
libswiftCore.dylib	4128768	4059136	-1.7%	1.02x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac Pro
  Model Identifier: MacPro6,1
  Processor Name: 12-Core Intel Xeon E5
  Processor Speed: 2.7 GHz
  Number of Processors: 1
  Total Number of Cores: 12
  L2 Cache (per Core): 256 KB
  L3 Cache: 30 MB
  Memory: 64 GB

atrick · 2020-05-01T21:51:38Z

@meg-gupta took over this PR. Thanks.

atrick added 3 commits April 20, 2020 15:29

SILOptimizer: Add enforcement of function-pass pipelines.

bf0814d

Don't allow module passes to be inserted within a function pass pipeline. This silently breaks the function pipeline both interfering with analysis and the normal pipeline restart mechanism.

WIP: Fix SimplifyCFG analysis invalidation.

ebf6411

Not all simplifications correctly invalidate CFG analyses.

atrick requested a review from meg-gupta April 20, 2020 22:55

atrick mentioned this pull request Apr 20, 2020

Fix the mid-level pass pipeline #22445

Closed

meg-gupta mentioned this pull request Apr 29, 2020

Fix the mid-level function-pass pipeline #31424

Merged

atrick closed this May 1, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[WIP] Fix the mid-level function-pass pipeline #31163

[WIP] Fix the mid-level function-pass pipeline #31163

Uh oh!

atrick commented Apr 20, 2020

Uh oh!

airspeedswift commented Apr 28, 2020

Uh oh!

swift-ci commented Apr 28, 2020

Uh oh!

atrick commented May 1, 2020

Uh oh!

Uh oh!

[WIP] Fix the mid-level function-pass pipeline #31163

[WIP] Fix the mid-level function-pass pipeline #31163

Uh oh!

Conversation

atrick commented Apr 20, 2020

Uh oh!

airspeedswift commented Apr 28, 2020

Uh oh!

swift-ci commented Apr 28, 2020

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

atrick commented May 1, 2020

Uh oh!

Uh oh!