Skip to content

EscapeAnalysis: make the use-point analysis more precise #27667

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 15, 2019

Conversation

eeckstein
Copy link
Contributor

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.

In canEscapeToUsePoint only check the content node if it's a reference (see comment why this is needed).
For all other node types, especially addresses, handle defer edges by propagating use-point infomation backward in the graph.
This makes escape analysis more precise with address types, e.g. don't consider an inout address to escape to an apply if just the loaded value is passed to an apply argument.
@eeckstein
Copy link
Contributor Author

@swift-ci test

@eeckstein
Copy link
Contributor Author

@swift-ci benchmark

@eeckstein eeckstein requested a review from atrick October 14, 2019 18:46
@swift-ci
Copy link
Contributor

Performance: -O

Regression OLD NEW DELTA RATIO
DropWhileArray 24 35 +45.8% 0.69x
Set.isStrictSubset.Seq.Int.Empty 110 123 +11.8% 0.89x
ObjectiveCBridgeStubFromNSDateRef 2400 2630 +9.6% 0.91x (?)
DictOfArraysToArrayOfDicts 580 624 +7.6% 0.93x (?)
 
Improvement OLD NEW DELTA RATIO
DropWhileAnySeqCntRange 49 15 -69.4% 3.27x
PrefixWhileAnyCollection 58 18 -69.0% 3.22x
DropWhileAnySeqCRangeIter 43 15 -65.1% 2.87x
PrefixAnySeqCRangeIterLazy 41 17 -58.5% 2.41x
PrefixAnySeqCntRangeLazy 41 17 -58.5% 2.41x
DropFirstAnySeqCRangeIterLazy 35 15 -57.1% 2.33x
DropFirstAnySeqCntRangeLazy 33 15 -54.5% 2.20x
DropFirstAnySeqCRangeIter 35 16 -54.3% 2.19x
DropFirstAnySeqCntRange 33 16 -51.5% 2.06x
DropWhileAnyCollection 38 19 -50.0% 2.00x
PrefixAnyCollection 31 18 -41.9% 1.72x
ObjectiveCBridgeStubDataAppend 4160 2720 -34.6% 1.53x
PrefixAnySeqCRangeIter 41 27 -34.1% 1.52x
Data.init.Sequence.64kB.Count0.RE 257 179 -30.4% 1.44x
Data.append.Sequence.64kB.Count0.RE.I 256 179 -30.1% 1.43x
Data.init.Sequence.64kB.Count0.RE.I 255 180 -29.4% 1.42x
DataSetCountSmall 82 58 -29.3% 1.41x
SetSymmetricDifferenceInt100 144 102 -29.2% 1.41x
Data.append.Sequence.64kB.Count0.RE 245 179 -26.9% 1.37x
PrefixWhileAnySeqCRangeIterLazy 34 25 -26.5% 1.36x
Data.init.Sequence.64kB.Count0 235 176 -25.1% 1.34x
Data.append.Sequence.64kB.Count0 236 177 -25.0% 1.33x
Data.init.Sequence.64kB.Count0.I 236 177 -25.0% 1.33x
Data.append.Sequence.64kB.Count0.I 235 177 -24.7% 1.33x
DropWhileAnySeqCRangeIterLazy 63 48 -23.8% 1.31x
Dictionary4 199 155 -22.1% 1.28x
Data.append.Sequence.809B.Count0.RE 366 286 -21.9% 1.28x
Data.append.Sequence.809B.Count0.RE.I 365 286 -21.6% 1.28x
Data.init.Sequence.809B.Count0.RE.I 416 329 -20.9% 1.26x
PrefixAnySeqCntRange 34 27 -20.6% 1.26x
Data.init.Sequence.2047B.Count0.I 431 344 -20.2% 1.25x
Data.init.Sequence.809B.Count0.RE 415 332 -20.0% 1.25x
Data.init.Sequence.2049B.Count0.I 432 346 -19.9% 1.25x
Data.init.Sequence.809B.Count0.I 408 329 -19.4% 1.24x
DropFirstAnyCollection 31 25 -19.4% 1.24x
Data.append.Sequence.809B.Count0.I 355 288 -18.9% 1.23x
Data.init.Sequence.809B.Count0 403 329 -18.4% 1.22x (?)
DictionarySwapAt 724 592 -18.2% 1.22x
SuffixAnyCollection 12 10 -16.7% 1.20x
DropLastAnyCollection 12 10 -16.7% 1.20x (?)
Data.append.Sequence.809B.Count0 350 293 -16.3% 1.19x
Data.init.Sequence.511B.Count0.I 434 364 -16.1% 1.19x
SetSymmetricDifferenceInt50 137 116 -15.3% 1.18x
DropWhileCountableRangeLazy 47 40 -14.9% 1.17x
NSStringConversion.Long 571 487 -14.7% 1.17x
Data.init.Sequence.513B.Count0.I 424 362 -14.6% 1.17x (?)
DropWhileAnyCollectionLazy 56 48 -14.3% 1.17x
DropWhileAnySeqCntRangeLazy 56 48 -14.3% 1.17x
Dictionary4OfObjects 229 199 -13.1% 1.15x
SetSymmetricDifferenceBox25 350 305 -12.9% 1.15x (?)
DropFirstAnySequence 2031 1774 -12.7% 1.14x
WordCountUniqueASCII 1440 1260 -12.5% 1.14x
Breadcrumbs.UTF16ToIdxRange.longASCII 25 22 -12.0% 1.14x
SetUnionBox25 275 244 -11.3% 1.13x
Set.subtracting.Empty.Box 9 8 -11.1% 1.12x
SetUnionBox0 411 366 -10.9% 1.12x
SetUnion_OfObjects 4110 3670 -10.7% 1.12x
WordCountUniqueUTF16 1620 1450 -10.5% 1.12x
SetExclusiveOr_OfObjects 4690 4260 -9.2% 1.10x (?)
SetSymmetricDifferenceBox0 469 427 -9.0% 1.10x (?)
NSStringConversion.Medium 259 236 -8.9% 1.10x (?)
DictionaryRemoveOfObjects 16000 14600 -8.7% 1.10x (?)
PrefixWhileAnyCollectionLazy 27 25 -7.4% 1.08x (?)
PrefixWhileAnySeqCntRangeLazy 27 25 -7.4% 1.08x (?)

Code size: -O

Regression OLD NEW DELTA RATIO
ObjectiveCBridgingStubs.o 18941 20285 +7.1% 0.93x
StringRemoveDupes.o 6990 7118 +1.8% 0.98x
 
Improvement OLD NEW DELTA RATIO
DropFirst.o 22224 19644 -11.6% 1.13x
DropWhile.o 20986 18710 -10.8% 1.12x
Prefix.o 21305 19205 -9.9% 1.11x
PrefixWhile.o 20080 18892 -5.9% 1.06x
DictTest3.o 21615 20943 -3.1% 1.03x
DictTest.o 19182 18654 -2.8% 1.03x
RangeReplaceableCollectionPlusDefault.o 5868 5772 -1.6% 1.02x
StringMatch.o 4688 4624 -1.4% 1.01x
DictTest2.o 14447 14303 -1.0% 1.01x

Performance: -Osize

Regression OLD NEW DELTA RATIO
SuffixArrayLazy 5 9 +80.0% 0.56x
DropWhileArray 25 35 +40.0% 0.71x
PrefixCountableRangeLazy 13 18 +38.5% 0.72x (?)
PrefixWhileAnySeqCntRangeLazy 81 108 +33.3% 0.75x
PrefixWhileArray 40 53 +32.5% 0.75x
DropLastAnyCollection 32 41 +28.1% 0.78x
PrefixWhileAnyCollectionLazy 81 94 +16.0% 0.86x (?)
ObjectiveCBridgeStubNSDateRefAccess 174 196 +12.6% 0.89x (?)
ObjectiveCBridgeStubFromNSDateRef 2290 2500 +9.2% 0.92x (?)
 
Improvement OLD NEW DELTA RATIO
PrefixWhileCountableRangeLazy 26 13 -50.0% 2.00x
PrefixArray 26 13 -50.0% 2.00x
DataSetCountSmall 85 58 -31.8% 1.47x
ObjectiveCBridgeStubDataAppend 4120 2920 -29.1% 1.41x
DropWhileAnySeqCntRangeLazy 170 121 -28.8% 1.40x
Data.append.Sequence.64kB.Count0.I 232 170 -26.7% 1.36x
Data.init.Sequence.64kB.Count0.RE 251 187 -25.5% 1.34x
Data.append.Sequence.64kB.Count0 228 170 -25.4% 1.34x (?)
DropWhileAnySeqCRangeIterLazy 162 121 -25.3% 1.34x
Data.init.Sequence.64kB.Count0 232 174 -25.0% 1.33x
PrefixAnySeqCntRangeLazy 108 81 -25.0% 1.33x
Data.append.Sequence.64kB.Count0.RE.I 249 188 -24.5% 1.32x
Data.init.Sequence.64kB.Count0.I 228 173 -24.1% 1.32x
Data.append.Sequence.64kB.Count0.RE 247 188 -23.9% 1.31x
Data.init.Sequence.64kB.Count0.RE.I 246 188 -23.6% 1.31x
Data.append.Sequence.809B.Count0.RE 378 292 -22.8% 1.29x
Data.append.Sequence.809B.Count0.RE.I 378 295 -22.0% 1.28x
Data.append.Sequence.809B.Count0 348 273 -21.6% 1.27x
Data.init.Sequence.809B.Count0.RE.I 426 336 -21.1% 1.27x
Data.init.Sequence.809B.Count0.RE 426 338 -20.7% 1.26x
Data.append.Sequence.809B.Count0.I 347 276 -20.5% 1.26x (?)
Set.isStrictSubset.Seq.Box0 1245 991 -20.4% 1.26x
Set.isStrictSuperset.Seq.Box0 12410 9879 -20.4% 1.26x
Set.isSubset.Seq.Box0 1246 992 -20.4% 1.26x
Set.isStrictSuperset.Seq.Box25 1251 996 -20.4% 1.26x
Data.init.Sequence.2047B.Count0.I 421 339 -19.5% 1.24x
Data.init.Sequence.2049B.Count0.I 420 341 -18.8% 1.23x
DictionarySwapAt 732 600 -18.0% 1.22x
Data.init.Sequence.809B.Count0.I 395 324 -18.0% 1.22x
Set.isSubset.Seq.Box25 1442 1186 -17.8% 1.22x (?)
Set.isStrictSubset.Seq.Box25 1441 1186 -17.7% 1.22x (?)
Data.init.Sequence.809B.Count0 388 323 -16.8% 1.20x
SuffixCountableRangeLazy 6 5 -16.7% 1.20x (?)
Data.init.Sequence.513B.Count0.I 417 356 -14.6% 1.17x (?)
NSStringConversion.Long 567 487 -14.1% 1.16x (?)
Data.init.Sequence.511B.Count0.I 415 357 -14.0% 1.16x (?)
Set.intersection.Seq.Box0 368 318 -13.6% 1.16x
Set.intersection.Seq.Box25 479 416 -13.2% 1.15x (?)
DataMutateBytesSmall 160 140 -12.5% 1.14x
WordCountUniqueASCII 1440 1270 -11.8% 1.13x (?)
PrefixWhileCountableRange 17 15 -11.8% 1.13x (?)
SetUnionBox25 329 294 -10.6% 1.12x
WordCountUniqueUTF16 1630 1460 -10.4% 1.12x
PrefixWhileAnyCollection 126 113 -10.3% 1.12x
SetUnion_OfObjects 4700 4220 -10.2% 1.11x
RemoveWhereMoveInts 20 18 -10.0% 1.11x (?)
SetSymmetricDifferenceBox25 494 445 -9.9% 1.11x (?)
SetUnionBox0 469 423 -9.8% 1.11x (?)
NSStringConversion.Medium 258 235 -8.9% 1.10x (?)
DropWhileCountableRangeLazy 49 45 -8.2% 1.09x (?)
SetSymmetricDifferenceBox0 622 574 -7.7% 1.08x (?)
DataSetCountMedium 410 380 -7.3% 1.08x (?)
DataMutateBytesMedium 2080 1940 -6.7% 1.07x (?)
DataAppendArray 3000 2800 -6.7% 1.07x (?)
DropWhileAnyCollection 121 113 -6.6% 1.07x (?)

Code size: -Osize

Regression OLD NEW DELTA RATIO
BucketSort.o 11183 11423 +2.1% 0.98x
SortLettersInPlace.o 8259 8371 +1.4% 0.99x
 
Improvement OLD NEW DELTA RATIO
PrefixWhile.o 18520 17588 -5.0% 1.05x
DropWhile.o 19164 18220 -4.9% 1.05x
Prefix.o 19436 18508 -4.8% 1.05x
StringMatch.o 4524 4428 -2.1% 1.02x

Performance: -Onone

Improvement OLD NEW DELTA RATIO
ObjectiveCBridgeStubDataAppend 4440 3220 -27.5% 1.38x
DataSetCountSmall 111 82 -26.1% 1.35x
SetSubtractingInt100 378 334 -11.6% 1.13x
DictionaryRemove 8710 7720 -11.4% 1.13x (?)
SetSymmetricDifferenceInt100 526 478 -9.1% 1.10x (?)
SetSymmetricDifferenceInt50 528 485 -8.1% 1.09x (?)
Set.filter.Int100.24k 2731 2510 -8.1% 1.09x (?)
Set.filter.Int100.20k 2319 2140 -7.7% 1.08x (?)
Set.filter.Int100.16k 1962 1811 -7.7% 1.08x (?)
SetSymmetricDifferenceInt25 532 492 -7.5% 1.08x (?)
SetSubtractingInt50 312 289 -7.4% 1.08x (?)
Set.filter.Int100.28k 3583 3320 -7.3% 1.08x (?)
CharacterLiteralsLarge 490 457 -6.7% 1.07x (?)

Code size: -swiftlibs

Improvement OLD NEW DELTA RATIO
libswiftSwiftOnoneSupport.dylib 180224 176128 -2.3% 1.02x
How to read the data The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview
  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

@eeckstein eeckstein merged commit 4f4557c into swiftlang:master Oct 15, 2019
@eeckstein eeckstein deleted the escape-analysis branch October 15, 2019 06:39
Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to see a unit test for the optimization(s) that you are unblocking. I need to know if the changes I want to make for robustness will interfere with that optimization. In this PR you only have incidental changes to the unit tests. It's not clear that those changes are essential to the test. So if someone changes anything later that breaks your expectations it won't be clear.

// ... except it has defer predecessor edges, which are (potentially) not
// escaping: use-points are propagated in reverse direction along defer
// edges!
!Node->hasDeferPredecessors()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to be fixing an existing correctness issue. Is that right?

Copy link
Contributor Author

@eeckstein eeckstein Oct 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, There was not correctness issue.
There is (better: was) no point in adding use-points for escaping nodes because escaping nodes would cause to bail anyway. Only now, when propagating in reverse direction, we have to consider this, because a predecessor might not be escaping even if the node itself is escaping.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying. But if the direction of the defer edge matters, then I think phi arguments needed defer edges in both directions. Or, we just say that the direction doesn't matter and propagate both escape states and use points in both directions.

If the direction does not matter, then we could probably just merge the nodes in the defer web.

CGNode *ContentNode = ConGraph->getContentNode(Node);
if (ContentNode->escapesInsideFunction(false))
return true;
if (V->getType().hasReferenceSemantics()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this check is fine for now. I don't think it's robust to assume here that all other pointsTo edges represent physical indirection (which is assumed by the caller of this API) when that fact isn't represented anywhere in the graph. That assumes something about how the graph is constructed and assumes that nothing unforeseen has happened to the graph in the mean time. In debugging escape graphs, I've noticed many scenarios that I never foresaw.

Very soon I can commit a change that marks nodes that start a heap object. I'm not sure that will be sufficient to handle whatever case you need optimized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, checking the new flag what you'll add, will be better and more robust.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The basic problem with this API is that the client has some specification for an "object" that doesn't match any concept in the ConnectionGraph. This is one of the key issues that I was addressing on my working branch before I put it on hold to work on bugs and benchmarks.

Can you tell me what test broke with this change so I can see if I'm also breaking it?

I wonder if it would be sufficient for you to simply do this check instead:
If the SILValue's node is a 'Value' node type, then it can't represent the content of a heap object. So, in that case, you should never need to look at it's content node.

When checking for a reference type. I'm afraid that getUnderlyingObject might look through an address to reference cast.

@eeckstein
Copy link
Contributor Author

Thanks for reviewing! I'll add a test case which shows the optimization problem in a follow-up commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants