Skip to content

EscapeAnalysis: make the use-point analysis more precise #28502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

eeckstein
Copy link
Contributor

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.

This PR is a re-base and re-apply of #27667 after @atrick's escape analysis changes.

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.
@eeckstein eeckstein requested a review from atrick November 28, 2019 14:28
@eeckstein
Copy link
Contributor Author

@swift-ci test

@eeckstein
Copy link
Contributor Author

@swift-ci benchmark

@swift-ci
Copy link
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
SuffixArray 5 9 +80.0% 0.56x
DropWhileArray 24 35 +45.8% 0.69x
ObjectiveCBridgeStubNSDateRefAccess 174 196 +12.6% 0.89x (?)
 
Improvement OLD NEW DELTA RATIO
PrefixAnySeqCRangeIterLazy 41 14 -65.9% 2.93x
PrefixAnySeqCntRangeLazy 41 14 -65.9% 2.93x
DropWhileAnySeqCntRange 36 15 -58.3% 2.40x
DropWhileAnySeqCRangeIter 36 15 -58.3% 2.40x
PrefixWhileAnyCollection 58 26 -55.2% 2.23x
PrefixAnySeqCRangeIter 40 21 -47.5% 1.90x
PrefixWhileAnySeqCRangeIterLazy 40 25 -37.5% 1.60x
DropFirstAnySeqCRangeIterLazy 35 22 -37.1% 1.59x
ObjectiveCBridgeStubDataAppend 4100 2660 -35.1% 1.54x
DropFirstAnySeqCRangeIter 35 23 -34.3% 1.52x
DropFirstAnySeqCntRangeLazy 33 22 -33.3% 1.50x
DropWhileAnyCollection 38 26 -31.6% 1.46x
DropFirstAnySeqCntRange 33 23 -30.3% 1.43x
SetSymmetricDifferenceInt100 143 103 -28.0% 1.39x
Data.append.Sequence.64kB.Count0.RE 251 181 -27.9% 1.39x
DataSetCountSmall 80 58 -27.5% 1.38x
Data.init.Sequence.64kB.Count0.RE.I 244 179 -26.6% 1.36x (?)
Data.append.Sequence.64kB.Count0.RE.I 245 180 -26.5% 1.36x
Data.init.Sequence.64kB.Count0.RE 244 181 -25.8% 1.35x
DropWhileAnyCollectionLazy 63 48 -23.8% 1.31x
DropWhileAnySeqCntRangeLazy 63 48 -23.8% 1.31x
Data.init.Sequence.64kB.Count0 236 180 -23.7% 1.31x
Data.append.Sequence.64kB.Count0.I 236 180 -23.7% 1.31x (?)
Data.init.Sequence.64kB.Count0.I 236 182 -22.9% 1.30x
PrefixWhileArray 53 41 -22.6% 1.29x
Data.append.Sequence.64kB.Count0 236 183 -22.5% 1.29x (?)
Data.append.Sequence.809B.Count0.RE 369 287 -22.2% 1.29x
EqualSubstringSubstring 27 21 -22.2% 1.29x
LessSubstringSubstring 27 21 -22.2% 1.29x
EqualSubstringSubstringGenericEquatable 27 21 -22.2% 1.29x
EqualSubstringString 27 21 -22.2% 1.29x
LessSubstringSubstringGenericComparable 27 21 -22.2% 1.29x
PrefixAnySeqCntRange 27 21 -22.2% 1.29x (?)
EqualStringSubstring 28 22 -21.4% 1.27x
Data.append.Sequence.809B.Count0.RE.I 366 288 -21.3% 1.27x
Dictionary4 194 154 -20.6% 1.26x
DropFirstAnyCollection 31 25 -19.4% 1.24x (?)
Data.init.Sequence.2049B.Count0.I 430 350 -18.6% 1.23x
Data.init.Sequence.2047B.Count0.I 429 351 -18.2% 1.22x
DictionarySwapAt 728 596 -18.1% 1.22x
Data.append.Sequence.809B.Count0 354 290 -18.1% 1.22x (?)
Data.append.Sequence.809B.Count0.I 354 290 -18.1% 1.22x (?)
Data.init.Sequence.809B.Count0.RE.I 400 331 -17.2% 1.21x
Data.init.Sequence.809B.Count0.RE 398 331 -16.8% 1.20x
SuffixAnyCollection 12 10 -16.7% 1.20x
DropLastAnyCollection 12 10 -16.7% 1.20x
PrefixAnyCollection 31 26 -16.1% 1.19x (?)
Data.init.Sequence.809B.Count0 394 331 -16.0% 1.19x
Data.init.Sequence.809B.Count0.I 395 333 -15.7% 1.19x (?)
SetSymmetricDifferenceInt50 137 116 -15.3% 1.18x
DropWhileAnySeqCRangeIterLazy 56 48 -14.3% 1.17x
Data.init.Sequence.511B.Count0.I 423 363 -14.2% 1.17x
Data.init.Sequence.513B.Count0.I 424 365 -13.9% 1.16x
SetSymmetricDifferenceBox25 350 305 -12.9% 1.15x
UTF8Decode_InitDecoding 119 105 -11.8% 1.13x (?)
SetUnionBox25 275 243 -11.6% 1.13x
SetUnion_OfObjects 4100 3660 -10.7% 1.12x
SetUnionBox0 410 366 -10.7% 1.12x
WordCountUniqueASCII 1430 1280 -10.5% 1.12x (?)
DropFirstAnySequence 1978 1774 -10.3% 1.11x
SetSymmetricDifferenceBox0 468 426 -9.0% 1.10x (?)
WordCountUniqueUTF16 1610 1470 -8.7% 1.10x
SetExclusiveOr_OfObjects 4670 4270 -8.6% 1.09x
Phonebook 1204 1106 -8.1% 1.09x (?)
DictionaryRemoveOfObjects 16000 14700 -8.1% 1.09x (?)
SetSymmetricDifferenceInt25 127 117 -7.9% 1.09x (?)
DataMutateBytesMedium 2080 1920 -7.7% 1.08x (?)
PrefixWhileAnyCollectionLazy 27 25 -7.4% 1.08x
PrefixWhileAnySeqCntRangeLazy 27 25 -7.4% 1.08x
SetSubtractingBox25 198 185 -6.6% 1.07x (?)
SetIntersectionBox25 183 171 -6.6% 1.07x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
ObjectiveCBridgingStubs.o 18941 20285 +7.1% 0.93x
StringRemoveDupes.o 6990 7118 +1.8% 0.98x
ReduceInto.o 15346 15506 +1.0% 0.99x
 
Improvement OLD NEW DELTA RATIO
DropFirst.o 22224 19628 -11.7% 1.13x
DropWhile.o 20970 18646 -11.1% 1.12x
Prefix.o 21433 19205 -10.4% 1.12x
PrefixWhile.o 20064 18876 -5.9% 1.06x
DictTest3.o 21615 20943 -3.1% 1.03x
DictTest.o 19182 18670 -2.7% 1.03x
RangeReplaceableCollectionPlusDefault.o 5868 5772 -1.6% 1.02x
StringMatch.o 4752 4688 -1.3% 1.01x
DictTest2.o 14447 14303 -1.0% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
SuffixAnyCollection 32 37 +15.6% 0.86x
ArrayAppendSequence 600 680 +13.3% 0.88x (?)
Calculator 140 155 +10.7% 0.90x (?)
DictOfArraysToArrayOfDicts 592 641 +8.3% 0.92x (?)
ArrayAppendLatin1 1258 1360 +8.1% 0.93x (?)
ArrayAppendUTF16 1258 1360 +8.1% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
DropLastArrayLazy 9 4 -55.5% 2.25x
PrefixAnyCollection 113 72 -36.3% 1.57x
DropLastCountableRangeLazy 9 6 -33.3% 1.50x
DropWhileAnyCollection 122 82 -32.8% 1.49x
DropWhileAnySeqCntRangeLazy 179 123 -31.3% 1.46x
PrefixWhileAnyCollection 134 96 -28.4% 1.40x
ObjectiveCBridgeStubDataAppend 4040 2920 -27.7% 1.38x
DataSetCountSmall 80 58 -27.5% 1.38x
Data.append.Sequence.64kB.Count0.RE 253 185 -26.9% 1.37x
Data.init.Sequence.64kB.Count0.I 232 170 -26.7% 1.36x
Data.init.Sequence.64kB.Count0 232 171 -26.3% 1.36x
Data.append.Sequence.64kB.Count0 233 172 -26.2% 1.35x
Data.append.Sequence.64kB.Count0.RE.I 252 187 -25.8% 1.35x
Data.append.Sequence.64kB.Count0.I 229 171 -25.3% 1.34x
Data.init.Sequence.64kB.Count0.RE.I 250 189 -24.4% 1.32x
DropFirstAnySeqCntRangeLazy 136 103 -24.3% 1.32x
Data.init.Sequence.64kB.Count0.RE 250 190 -24.0% 1.32x
DropWhileAnySeqCRangeIterLazy 161 123 -23.6% 1.31x
Data.append.Sequence.809B.Count0.RE 379 294 -22.4% 1.29x
LessSubstringSubstring 27 21 -22.2% 1.29x (?)
EqualSubstringString 27 21 -22.2% 1.29x
LessSubstringSubstringGenericComparable 27 21 -22.2% 1.29x
Data.append.Sequence.809B.Count0.RE.I 376 295 -21.5% 1.27x
EqualSubstringSubstring 28 22 -21.4% 1.27x
EqualStringSubstring 28 22 -21.4% 1.27x
EqualSubstringSubstringGenericEquatable 28 22 -21.4% 1.27x
Data.append.Sequence.809B.Count0.I 349 275 -21.2% 1.27x
Set.isStrictSuperset.Seq.Box0 12399 9787 -21.1% 1.27x
Set.isStrictSubset.Seq.Box0 1244 982 -21.1% 1.27x
Set.isStrictSuperset.Seq.Box25 1249 986 -21.1% 1.27x
Set.isSubset.Seq.Box0 1242 982 -20.9% 1.26x
Data.append.Sequence.809B.Count0 349 276 -20.9% 1.26x
Data.init.Sequence.2047B.Count0.I 422 336 -20.4% 1.26x
Data.init.Sequence.2049B.Count0.I 422 338 -19.9% 1.25x
Data.init.Sequence.809B.Count0.RE 415 336 -19.0% 1.24x
Data.init.Sequence.809B.Count0.RE.I 418 339 -18.9% 1.23x
DictionarySwapAt 732 596 -18.6% 1.23x
Set.isSubset.Seq.Box25 1441 1177 -18.3% 1.22x
Set.isStrictSubset.Seq.Box25 1441 1178 -18.3% 1.22x
Data.init.Sequence.809B.Count0 387 321 -17.1% 1.21x
Data.init.Sequence.809B.Count0.I 389 328 -15.7% 1.19x
Data.init.Sequence.511B.Count0.I 422 357 -15.4% 1.18x (?)
SortIntPyramid 530 450 -15.1% 1.18x
Data.init.Sequence.513B.Count0.I 413 353 -14.5% 1.17x (?)
Set.intersection.Seq.Box0 368 316 -14.1% 1.16x
PrefixAnySeqCRangeIterLazy 94 81 -13.8% 1.16x (?)
PrefixWhileAnySeqCRangeIterLazy 94 81 -13.8% 1.16x (?)
DropWhileAnyCollectionLazy 135 117 -13.3% 1.15x
Set.intersection.Seq.Box25 479 416 -13.2% 1.15x
DataMutateBytesSmall 160 140 -12.5% 1.14x (?)
UTF8Decode_InitDecoding 120 105 -12.5% 1.14x
WordCountUniqueASCII 1450 1270 -12.4% 1.14x (?)
ObjectiveCBridgeStubNSDateRefAccess 196 174 -11.2% 1.13x (?)
SetUnionBox25 329 294 -10.6% 1.12x
StringComparison_longSharedPrefix 355 320 -9.9% 1.11x
WordCountUniqueUTF16 1630 1470 -9.8% 1.11x
SetUnion_OfObjects 4690 4230 -9.8% 1.11x
SetUnionBox0 469 423 -9.8% 1.11x
SetSymmetricDifferenceBox25 493 445 -9.7% 1.11x (?)
Phonebook 1281 1162 -9.3% 1.10x
SetExclusiveOr_OfObjects 6230 5740 -7.9% 1.09x (?)
SetSymmetricDifferenceBox0 623 574 -7.9% 1.09x (?)
DataSetCountMedium 420 390 -7.1% 1.08x (?)
DataMutateBytesMedium 2080 1940 -6.7% 1.07x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
CSVParsing.o 54840 56976 +3.9% 0.96x
BucketSort.o 11183 11423 +2.1% 0.98x
SortLettersInPlace.o 8259 8371 +1.4% 0.99x
 
Improvement OLD NEW DELTA RATIO
PrefixWhile.o 18472 17620 -4.6% 1.05x
DropWhile.o 19132 18268 -4.5% 1.05x
Prefix.o 19452 18588 -4.4% 1.05x
StringMatch.o 4572 4476 -2.1% 1.02x

Performance: -Onone

Improvement OLD NEW DELTA RATIO
ObjectiveCBridgeStubDataAppend 4400 3220 -26.8% 1.37x
DataSetCountSmall 106 84 -20.8% 1.26x
EqualSubstringSubstringGenericEquatable 31 25 -19.4% 1.24x (?)
LessSubstringSubstringGenericComparable 31 25 -19.4% 1.24x
EqualSubstringSubstring 32 26 -18.7% 1.23x (?)
LessSubstringSubstring 32 26 -18.7% 1.23x
EqualStringSubstring 32 26 -18.7% 1.23x
EqualSubstringString 32 27 -15.6% 1.19x
SetSubtractingInt100 376 336 -10.6% 1.12x (?)
UTF8Decode_InitFromData_ascii 179 163 -8.9% 1.10x (?)
UTF8Decode_InitDecoding 136 124 -8.8% 1.10x (?)
DictionaryRemove 8550 7800 -8.8% 1.10x (?)
Set.filter.Int100.24k 2772 2558 -7.7% 1.08x
Set.filter.Int100.20k 2358 2180 -7.5% 1.08x (?)
Set.filter.Int100.16k 1989 1847 -7.1% 1.08x (?)
Set.filter.Int50.24k 1691 1575 -6.9% 1.07x (?)
Set.filter.Int100.28k 3633 3386 -6.8% 1.07x (?)
Set.filter.Int50.20k 1435 1339 -6.7% 1.07x

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftSwiftOnoneSupport.dylib 180224 176128 -2.3% 1.02x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@atrick
Copy link
Contributor

atrick commented Dec 4, 2019

@eeckstein
I originally wanted to merge the CGNode hasRC flag optimistically. We can still do that, but then we need a rule about which CGNodes can be safely merged. Currently, I think we could arbitrarily merge any (or all) content nodes and it would be conservatively correct, which is a nice property. So, this is a valid option, but does make the connection graph conceptually more complex.

canEscapeTo makes assumptions in two directions: how the connection graph was built and how the API is used, so let's look at both...

This PR assumes that CG nodes are all separate physical objects (from the perspective of SIL pointer conversion/casting/projections) except when the node corresponds to the properties of a class, in which case there is exactly one interior node. It might be possible to make that assumption here, it just isn't formalized in the representation (aside from the hasRC flag) or enforced, or even specified anywhere yet. Even if we prove that the graph is always built that way, we have to be sure that merging content nodes doesn't change the structure. If we get the same benefit without making such assumptions that's better because it's conceptually simpler and the escape analysis representation could be changed later without breaking things in confusing ways.

In the other direction, this PR assumes that 'V' will always be a "reference type" when its mapped CGNode represents any memory that might be reachable from a reference. Given all the ways that addresses, raw pointers and references can be converted in SIL it's hard for me to be convinced that it's true. I think that at least requires more clarity on the alias analysis side to ensure that it can always "see through" all an object's interior pointers and never "sees past" a reference type back to a non-reference pointer or address.

The the way I wanted to handle this case is just by making one obvious assumption in each direction:

  • a CGNode with the RC flag set always represents the start of a heap object (that's the whole point of the flag)

  • The value 'V' that alias analysis provides points to the same a physical object as any memory accesses that are being considered.

#28564

But, there's a problem with that approach. The question is whether the problem is specific to my approach or whether I'm just exposing it. I think the alias analysis query just needs to be adjusted. I'll explain the issue there.

@eeckstein eeckstein closed this Feb 14, 2020
@eeckstein
Copy link
Contributor Author

Already handled with Andy's escape analysis changes

@eeckstein eeckstein deleted the escape-analysis-use-points branch February 14, 2020 12:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants