EscapeAnalysis: make the use-point analysis more precise #28502

eeckstein · 2019-11-28T14:28:55Z

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.

This PR is a re-base and re-apply of #27667 after @atrick's escape analysis changes.

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed). For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph. This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.

eeckstein · 2019-11-28T14:29:07Z

@swift-ci test

eeckstein · 2019-11-28T14:29:20Z

@swift-ci benchmark

swift-ci · 2019-11-28T14:51:59Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
SuffixArray	5	9	+80.0%	0.56x
DropWhileArray	24	35	+45.8%	0.69x
ObjectiveCBridgeStubNSDateRefAccess	174	196	+12.6%	0.89x (?)

Improvement	OLD	NEW	DELTA	RATIO
PrefixAnySeqCRangeIterLazy	41	14	-65.9%	2.93x
PrefixAnySeqCntRangeLazy	41	14	-65.9%	2.93x
DropWhileAnySeqCntRange	36	15	-58.3%	2.40x
DropWhileAnySeqCRangeIter	36	15	-58.3%	2.40x
PrefixWhileAnyCollection	58	26	-55.2%	2.23x
PrefixAnySeqCRangeIter	40	21	-47.5%	1.90x
PrefixWhileAnySeqCRangeIterLazy	40	25	-37.5%	1.60x
DropFirstAnySeqCRangeIterLazy	35	22	-37.1%	1.59x
ObjectiveCBridgeStubDataAppend	4100	2660	-35.1%	1.54x
DropFirstAnySeqCRangeIter	35	23	-34.3%	1.52x
DropFirstAnySeqCntRangeLazy	33	22	-33.3%	1.50x
DropWhileAnyCollection	38	26	-31.6%	1.46x
DropFirstAnySeqCntRange	33	23	-30.3%	1.43x
SetSymmetricDifferenceInt100	143	103	-28.0%	1.39x
Data.append.Sequence.64kB.Count0.RE	251	181	-27.9%	1.39x
DataSetCountSmall	80	58	-27.5%	1.38x
Data.init.Sequence.64kB.Count0.RE.I	244	179	-26.6%	1.36x (?)
Data.append.Sequence.64kB.Count0.RE.I	245	180	-26.5%	1.36x
Data.init.Sequence.64kB.Count0.RE	244	181	-25.8%	1.35x
DropWhileAnyCollectionLazy	63	48	-23.8%	1.31x
DropWhileAnySeqCntRangeLazy	63	48	-23.8%	1.31x
Data.init.Sequence.64kB.Count0	236	180	-23.7%	1.31x
Data.append.Sequence.64kB.Count0.I	236	180	-23.7%	1.31x (?)
Data.init.Sequence.64kB.Count0.I	236	182	-22.9%	1.30x
PrefixWhileArray	53	41	-22.6%	1.29x
Data.append.Sequence.64kB.Count0	236	183	-22.5%	1.29x (?)
Data.append.Sequence.809B.Count0.RE	369	287	-22.2%	1.29x
EqualSubstringSubstring	27	21	-22.2%	1.29x
LessSubstringSubstring	27	21	-22.2%	1.29x
EqualSubstringSubstringGenericEquatable	27	21	-22.2%	1.29x
EqualSubstringString	27	21	-22.2%	1.29x
LessSubstringSubstringGenericComparable	27	21	-22.2%	1.29x
PrefixAnySeqCntRange	27	21	-22.2%	1.29x (?)
EqualStringSubstring	28	22	-21.4%	1.27x
Data.append.Sequence.809B.Count0.RE.I	366	288	-21.3%	1.27x
Dictionary4	194	154	-20.6%	1.26x
DropFirstAnyCollection	31	25	-19.4%	1.24x (?)
Data.init.Sequence.2049B.Count0.I	430	350	-18.6%	1.23x
Data.init.Sequence.2047B.Count0.I	429	351	-18.2%	1.22x
DictionarySwapAt	728	596	-18.1%	1.22x
Data.append.Sequence.809B.Count0	354	290	-18.1%	1.22x (?)
Data.append.Sequence.809B.Count0.I	354	290	-18.1%	1.22x (?)
Data.init.Sequence.809B.Count0.RE.I	400	331	-17.2%	1.21x
Data.init.Sequence.809B.Count0.RE	398	331	-16.8%	1.20x
SuffixAnyCollection	12	10	-16.7%	1.20x
DropLastAnyCollection	12	10	-16.7%	1.20x
PrefixAnyCollection	31	26	-16.1%	1.19x (?)
Data.init.Sequence.809B.Count0	394	331	-16.0%	1.19x
Data.init.Sequence.809B.Count0.I	395	333	-15.7%	1.19x (?)
SetSymmetricDifferenceInt50	137	116	-15.3%	1.18x
DropWhileAnySeqCRangeIterLazy	56	48	-14.3%	1.17x
Data.init.Sequence.511B.Count0.I	423	363	-14.2%	1.17x
Data.init.Sequence.513B.Count0.I	424	365	-13.9%	1.16x
SetSymmetricDifferenceBox25	350	305	-12.9%	1.15x
UTF8Decode_InitDecoding	119	105	-11.8%	1.13x (?)
SetUnionBox25	275	243	-11.6%	1.13x
SetUnion_OfObjects	4100	3660	-10.7%	1.12x
SetUnionBox0	410	366	-10.7%	1.12x
WordCountUniqueASCII	1430	1280	-10.5%	1.12x (?)
DropFirstAnySequence	1978	1774	-10.3%	1.11x
SetSymmetricDifferenceBox0	468	426	-9.0%	1.10x (?)
WordCountUniqueUTF16	1610	1470	-8.7%	1.10x
SetExclusiveOr_OfObjects	4670	4270	-8.6%	1.09x
Phonebook	1204	1106	-8.1%	1.09x (?)
DictionaryRemoveOfObjects	16000	14700	-8.1%	1.09x (?)
SetSymmetricDifferenceInt25	127	117	-7.9%	1.09x (?)
DataMutateBytesMedium	2080	1920	-7.7%	1.08x (?)
PrefixWhileAnyCollectionLazy	27	25	-7.4%	1.08x
PrefixWhileAnySeqCntRangeLazy	27	25	-7.4%	1.08x
SetSubtractingBox25	198	185	-6.6%	1.07x (?)
SetIntersectionBox25	183	171	-6.6%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
ObjectiveCBridgingStubs.o	18941	20285	+7.1%	0.93x
StringRemoveDupes.o	6990	7118	+1.8%	0.98x
ReduceInto.o	15346	15506	+1.0%	0.99x

Improvement	OLD	NEW	DELTA	RATIO
DropFirst.o	22224	19628	-11.7%	1.13x
DropWhile.o	20970	18646	-11.1%	1.12x
Prefix.o	21433	19205	-10.4%	1.12x
PrefixWhile.o	20064	18876	-5.9%	1.06x
DictTest3.o	21615	20943	-3.1%	1.03x
DictTest.o	19182	18670	-2.7%	1.03x
RangeReplaceableCollectionPlusDefault.o	5868	5772	-1.6%	1.02x
StringMatch.o	4752	4688	-1.3%	1.01x
DictTest2.o	14447	14303	-1.0%	1.01x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
SuffixAnyCollection	32	37	+15.6%	0.86x
ArrayAppendSequence	600	680	+13.3%	0.88x (?)
Calculator	140	155	+10.7%	0.90x (?)
DictOfArraysToArrayOfDicts	592	641	+8.3%	0.92x (?)
ArrayAppendLatin1	1258	1360	+8.1%	0.93x (?)
ArrayAppendUTF16	1258	1360	+8.1%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
DropLastArrayLazy	9	4	-55.5%	2.25x
PrefixAnyCollection	113	72	-36.3%	1.57x
DropLastCountableRangeLazy	9	6	-33.3%	1.50x
DropWhileAnyCollection	122	82	-32.8%	1.49x
DropWhileAnySeqCntRangeLazy	179	123	-31.3%	1.46x
PrefixWhileAnyCollection	134	96	-28.4%	1.40x
ObjectiveCBridgeStubDataAppend	4040	2920	-27.7%	1.38x
DataSetCountSmall	80	58	-27.5%	1.38x
Data.append.Sequence.64kB.Count0.RE	253	185	-26.9%	1.37x
Data.init.Sequence.64kB.Count0.I	232	170	-26.7%	1.36x
Data.init.Sequence.64kB.Count0	232	171	-26.3%	1.36x
Data.append.Sequence.64kB.Count0	233	172	-26.2%	1.35x
Data.append.Sequence.64kB.Count0.RE.I	252	187	-25.8%	1.35x
Data.append.Sequence.64kB.Count0.I	229	171	-25.3%	1.34x
Data.init.Sequence.64kB.Count0.RE.I	250	189	-24.4%	1.32x
DropFirstAnySeqCntRangeLazy	136	103	-24.3%	1.32x
Data.init.Sequence.64kB.Count0.RE	250	190	-24.0%	1.32x
DropWhileAnySeqCRangeIterLazy	161	123	-23.6%	1.31x
Data.append.Sequence.809B.Count0.RE	379	294	-22.4%	1.29x
LessSubstringSubstring	27	21	-22.2%	1.29x (?)
EqualSubstringString	27	21	-22.2%	1.29x
LessSubstringSubstringGenericComparable	27	21	-22.2%	1.29x
Data.append.Sequence.809B.Count0.RE.I	376	295	-21.5%	1.27x
EqualSubstringSubstring	28	22	-21.4%	1.27x
EqualStringSubstring	28	22	-21.4%	1.27x
EqualSubstringSubstringGenericEquatable	28	22	-21.4%	1.27x
Data.append.Sequence.809B.Count0.I	349	275	-21.2%	1.27x
Set.isStrictSuperset.Seq.Box0	12399	9787	-21.1%	1.27x
Set.isStrictSubset.Seq.Box0	1244	982	-21.1%	1.27x
Set.isStrictSuperset.Seq.Box25	1249	986	-21.1%	1.27x
Set.isSubset.Seq.Box0	1242	982	-20.9%	1.26x
Data.append.Sequence.809B.Count0	349	276	-20.9%	1.26x
Data.init.Sequence.2047B.Count0.I	422	336	-20.4%	1.26x
Data.init.Sequence.2049B.Count0.I	422	338	-19.9%	1.25x
Data.init.Sequence.809B.Count0.RE	415	336	-19.0%	1.24x
Data.init.Sequence.809B.Count0.RE.I	418	339	-18.9%	1.23x
DictionarySwapAt	732	596	-18.6%	1.23x
Set.isSubset.Seq.Box25	1441	1177	-18.3%	1.22x
Set.isStrictSubset.Seq.Box25	1441	1178	-18.3%	1.22x
Data.init.Sequence.809B.Count0	387	321	-17.1%	1.21x
Data.init.Sequence.809B.Count0.I	389	328	-15.7%	1.19x
Data.init.Sequence.511B.Count0.I	422	357	-15.4%	1.18x (?)
SortIntPyramid	530	450	-15.1%	1.18x
Data.init.Sequence.513B.Count0.I	413	353	-14.5%	1.17x (?)
Set.intersection.Seq.Box0	368	316	-14.1%	1.16x
PrefixAnySeqCRangeIterLazy	94	81	-13.8%	1.16x (?)
PrefixWhileAnySeqCRangeIterLazy	94	81	-13.8%	1.16x (?)
DropWhileAnyCollectionLazy	135	117	-13.3%	1.15x
Set.intersection.Seq.Box25	479	416	-13.2%	1.15x
DataMutateBytesSmall	160	140	-12.5%	1.14x (?)
UTF8Decode_InitDecoding	120	105	-12.5%	1.14x
WordCountUniqueASCII	1450	1270	-12.4%	1.14x (?)
ObjectiveCBridgeStubNSDateRefAccess	196	174	-11.2%	1.13x (?)
SetUnionBox25	329	294	-10.6%	1.12x
StringComparison_longSharedPrefix	355	320	-9.9%	1.11x
WordCountUniqueUTF16	1630	1470	-9.8%	1.11x
SetUnion_OfObjects	4690	4230	-9.8%	1.11x
SetUnionBox0	469	423	-9.8%	1.11x
SetSymmetricDifferenceBox25	493	445	-9.7%	1.11x (?)
Phonebook	1281	1162	-9.3%	1.10x
SetExclusiveOr_OfObjects	6230	5740	-7.9%	1.09x (?)
SetSymmetricDifferenceBox0	623	574	-7.9%	1.09x (?)
DataSetCountMedium	420	390	-7.1%	1.08x (?)
DataMutateBytesMedium	2080	1940	-6.7%	1.07x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
CSVParsing.o	54840	56976	+3.9%	0.96x
BucketSort.o	11183	11423	+2.1%	0.98x
SortLettersInPlace.o	8259	8371	+1.4%	0.99x

Improvement	OLD	NEW	DELTA	RATIO
PrefixWhile.o	18472	17620	-4.6%	1.05x
DropWhile.o	19132	18268	-4.5%	1.05x
Prefix.o	19452	18588	-4.4%	1.05x
StringMatch.o	4572	4476	-2.1%	1.02x

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
ObjectiveCBridgeStubDataAppend	4400	3220	-26.8%	1.37x
DataSetCountSmall	106	84	-20.8%	1.26x
EqualSubstringSubstringGenericEquatable	31	25	-19.4%	1.24x (?)
LessSubstringSubstringGenericComparable	31	25	-19.4%	1.24x
EqualSubstringSubstring	32	26	-18.7%	1.23x (?)
LessSubstringSubstring	32	26	-18.7%	1.23x
EqualStringSubstring	32	26	-18.7%	1.23x
EqualSubstringString	32	27	-15.6%	1.19x
SetSubtractingInt100	376	336	-10.6%	1.12x (?)
UTF8Decode_InitFromData_ascii	179	163	-8.9%	1.10x (?)
UTF8Decode_InitDecoding	136	124	-8.8%	1.10x (?)
DictionaryRemove	8550	7800	-8.8%	1.10x (?)
Set.filter.Int100.24k	2772	2558	-7.7%	1.08x
Set.filter.Int100.20k	2358	2180	-7.5%	1.08x (?)
Set.filter.Int100.16k	1989	1847	-7.1%	1.08x (?)
Set.filter.Int50.24k	1691	1575	-6.9%	1.07x (?)
Set.filter.Int100.28k	3633	3386	-6.8%	1.07x (?)
Set.filter.Int50.20k	1435	1339	-6.7%	1.07x

Code size: -swiftlibs

Improvement	OLD	NEW	DELTA	RATIO
libswiftSwiftOnoneSupport.dylib	180224	176128	-2.3%	1.02x

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

atrick · 2019-12-04T18:29:48Z

@eeckstein
I originally wanted to merge the CGNode hasRC flag optimistically. We can still do that, but then we need a rule about which CGNodes can be safely merged. Currently, I think we could arbitrarily merge any (or all) content nodes and it would be conservatively correct, which is a nice property. So, this is a valid option, but does make the connection graph conceptually more complex.

canEscapeTo makes assumptions in two directions: how the connection graph was built and how the API is used, so let's look at both...

This PR assumes that CG nodes are all separate physical objects (from the perspective of SIL pointer conversion/casting/projections) except when the node corresponds to the properties of a class, in which case there is exactly one interior node. It might be possible to make that assumption here, it just isn't formalized in the representation (aside from the hasRC flag) or enforced, or even specified anywhere yet. Even if we prove that the graph is always built that way, we have to be sure that merging content nodes doesn't change the structure. If we get the same benefit without making such assumptions that's better because it's conceptually simpler and the escape analysis representation could be changed later without breaking things in confusing ways.

In the other direction, this PR assumes that 'V' will always be a "reference type" when its mapped CGNode represents any memory that might be reachable from a reference. Given all the ways that addresses, raw pointers and references can be converted in SIL it's hard for me to be convinced that it's true. I think that at least requires more clarity on the alias analysis side to ensure that it can always "see through" all an object's interior pointers and never "sees past" a reference type back to a non-reference pointer or address.

The the way I wanted to handle this case is just by making one obvious assumption in each direction:

a CGNode with the RC flag set always represents the start of a heap object (that's the whole point of the flag)
The value 'V' that alias analysis provides points to the same a physical object as any memory accesses that are being considered.

#28564

But, there's a problem with that approach. The question is whether the problem is specific to my approach or whether I'm just exposing it. I think the alias analysis query just needs to be adjusted. I'll explain the issue there.

eeckstein · 2020-02-14T12:46:41Z

Already handled with Andy's escape analysis changes

eeckstein requested a review from atrick November 28, 2019 14:28

eeckstein closed this Feb 14, 2020

eeckstein deleted the escape-analysis-use-points branch February 14, 2020 12:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

EscapeAnalysis: make the use-point analysis more precise #28502

EscapeAnalysis: make the use-point analysis more precise #28502

Uh oh!

eeckstein commented Nov 28, 2019

Uh oh!

eeckstein commented Nov 28, 2019

Uh oh!

eeckstein commented Nov 28, 2019

Uh oh!

swift-ci commented Nov 28, 2019

Uh oh!

atrick commented Dec 4, 2019

Uh oh!

eeckstein commented Feb 14, 2020

Uh oh!

Uh oh!

EscapeAnalysis: make the use-point analysis more precise #28502

EscapeAnalysis: make the use-point analysis more precise #28502

Uh oh!

Conversation

eeckstein commented Nov 28, 2019

Uh oh!

eeckstein commented Nov 28, 2019

Uh oh!

eeckstein commented Nov 28, 2019

Uh oh!

swift-ci commented Nov 28, 2019

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

atrick commented Dec 4, 2019

Uh oh!

eeckstein commented Feb 14, 2020

Uh oh!

Uh oh!