EscapeAnalysis: add node flags and change the meaning of "escaping" #28800

atrick · 2019-12-16T07:48:29Z

I've broken this PR loosely into commits. They can't quite be tested independently, but it does help for the sake of git log and annotate.

EscapeAnalysis: Add PointerKind and interior/reference flags

Categorize three kinds of pointers:

NoPointer (don't create a node)

ReferenceOnly (safe to make normal assumptions)

AnyPointer (may have addresses, rawpointers, or any mix of thoses
with references)

Flag ConnectionGraph nodes as
- hasReferenceOnly
- isInterior
An interior node always has an additional content node.

All sorts of arbitrary node merging is supported. Nodes with totally
different properties can be safely merged. Interior nodes can safely
be merged with their field content (which does happen surprisingly
often).

Alias analysis will use these flags to safely make assumptions about
properties of the connection graph.
EscapeAnalysis: Make EscapeState and UsePoints a property of the
content node only.

For alias analysis query to be generally correct, we need to
effectively merge the escape state and use points for everything in a
defer web.

It was unclear from the current design whether the "escaping" property
applied to the pointer value or its content. The implementation is
inconsistent in how it was treated. It appears that some bugs have
been worked around by propagating forward through defer edges, some
have been worked around by querying the content instead of the
pointer, and others have been worked around be creating fake use
points at block arguments.

If we always simply query the content for escape state and use points,
then we never need to propagate along defer edges. The current code
that propagates escape state along defer edges in one direction is
simply incorrect from the perspective of alias analysis.

One very attractive solution is to merge nodes eagerly without
creating any defer edges, but that would be a much more radical change
even than what I've done here. It would also pose some new issues: how
to resolve the current "node types" when merging and how to deal with
missing content nodes.

This solution of applying escape state to content nodes solves all
these problems without too radical of a change at the expense of
eagerly creating content nodes. (The potential graph memory usage is
not really an issue because it's possible to drastically shrink the
size of the graph anyway in a future commit--I've been able to fit a
node within one cache line). This solution nicely preserves graph
structure which makes it easy to debug and relate to the IR.

Eagerly creating content nodes also solves the missing content node
problem. For example, when querying canEscapeTo, we need to know
whether to look at the escape state for just the pointer value itself,
or also for its content. It may be possible the its content node is
actually part of the same object at the IR level. If the content node
is missing, then we don't know if the object's interior address is not
recognizable/representable or whether we simply never saw an access to
the interior address. We can't simply look at whether the current IR
value happens to be a reference, because that doesn't tell us whether
the graph node may have been merged with a non-reference node or even
with it's own content node. To be correct in general, this query would
need to be extremely conservative. However, if content nodes are
always created for references, then we only need to query the escape
state of a pointer's content node. The content node's flag tells us if
it's an interior node, in which case it will always point to another
content node which also needs to be queried.
EscapeAnalysis: Do not create defer edges for block arguemnts.

That appears to have been a partial workaround for the real
problem that usepoints need to be propagated across the entire
defer web. This is now solved by considering use points on the
reference node's content, not the reference node itself.
EscapeAnalysis: rewrite canEscapeToUsePoint.

Correctness: do not make any unenforced assumptions about how the
connection graph is built (I don't think the previous assumption about
the structure of the graph node mapped to a reference-type value would
always hold if content nodes can be arbitrarily merged). Only make one
assumption about the client code: the access being checked must be to
some address within the provided value, not another object indirectly
reachable from that value.

Optimization: Allow escape analysis to prove that an addressable
object does not escape even when one of its reference-type fields
escapes.
EscapeAnalysis: Add and update tests.

Fix tests for changes to EscapeAnalysis output
- New node flags
- Only content nodes are marked escaping
- Content nodes are eagerly created
- No defer edges for block args

atrick · 2019-12-16T07:48:38Z

@swift-ci test

atrick · 2019-12-16T07:48:48Z

@swift-ci test source compatibility

atrick · 2019-12-16T07:49:00Z

@swift-ci benchmark

swift-ci · 2019-12-16T08:11:36Z

Performance: -O

Regression	OLD	NEW	DELTA	RATIO
SuffixArray	4	9	+125.0%	0.44x
DropFirstArray	14	26	+85.7%	0.54x
DropWhileArray	23	35	+52.2%	0.66x (?)
ObjectAllocation	92	119	+29.3%	0.77x
NSStringConversion.Long	497	578	+16.3%	0.86x
Calculator	131	151	+15.3%	0.87x
FlattenListFlatMap	5118	5726	+11.9%	0.89x (?)
Chars2	3050	3400	+11.5%	0.90x
OpenClose	58	63	+8.6%	0.92x (?)

Improvement	OLD	NEW	DELTA	RATIO
PrefixAnySeqCRangeIterLazy	41	14	-65.9%	2.93x
PrefixAnySeqCntRangeLazy	41	14	-65.9%	2.93x
DropWhileAnySeqCntRange	36	15	-58.3%	2.40x
DropWhileAnySeqCRangeIter	36	15	-58.3%	2.40x
SuffixArrayLazy	9	4	-55.5%	2.25x
PrefixWhileAnyCollection	58	26	-55.2%	2.23x
PrefixAnySeqCRangeIter	39	23	-41.0%	1.70x (?)
DropFirstAnySeqCRangeIterLazy	35	22	-37.1%	1.59x
PrefixWhileAnySeqCRangeIterLazy	39	25	-35.9%	1.56x
DropFirstAnySeqCRangeIter	35	23	-34.3%	1.52x
DropFirstAnySeqCntRangeLazy	33	22	-33.3%	1.50x
DropFirstSequence	47	32	-31.9%	1.47x
DropFirstSequenceLazy	47	32	-31.9%	1.47x
DropWhileAnyCollection	38	26	-31.6%	1.46x
DropFirstAnySeqCntRange	32	23	-28.1%	1.39x
PrefixWhileArray	53	40	-24.5%	1.32x
DropWhileAnyCollectionLazy	63	48	-23.8%	1.31x (?)
DropWhileAnySeqCntRangeLazy	63	48	-23.8%	1.31x
DropFirstAnyCollection	31	25	-19.4%	1.24x
SuffixAnyCollection	12	10	-16.7%	1.20x
DropLastAnyCollection	12	10	-16.7%	1.20x
PrefixAnyCollection	31	26	-16.1%	1.19x
DropWhileAnySeqCRangeIterLazy	56	48	-14.3%	1.17x (?)
UTF8Decode_InitFromData_ascii	180	160	-11.1%	1.12x (?)
RemoveWhereMoveInts	20	18	-10.0%	1.11x (?)
Phonebook	1211	1099	-9.2%	1.10x (?)
ArrayLiteral2	76	69	-9.2%	1.10x (?)
ArrayAppendUTF16Substring	11412	10404	-8.8%	1.10x (?)
PrefixWhileAnyCollectionLazy	27	25	-7.4%	1.08x
PrefixWhileAnySeqCntRangeLazy	27	25	-7.4%	1.08x
UTF8Decode_InitFromData	131	122	-6.9%	1.07x (?)
UTF8Decode_InitFromBytes_ascii	221	206	-6.8%	1.07x (?)

Code size: -O

Regression	OLD	NEW	DELTA	RATIO
StringRemoveDupes.o	6990	7150	+2.3%	0.98x
CSVParsing.o	57756	58892	+2.0%	0.98x
ReduceInto.o	15346	15506	+1.0%	0.99x

Improvement	OLD	NEW	DELTA	RATIO
DropFirst.o	22578	20062	-11.1%	1.13x
DropWhile.o	20970	18710	-10.8%	1.12x
Prefix.o	21814	19794	-9.3%	1.10x
PrefixWhile.o	20064	19004	-5.3%	1.06x
DictTest3.o	21615	20943	-3.1%	1.03x
DictTest.o	19182	18670	-2.7%	1.03x
StringMatch.o	4752	4688	-1.3%	1.01x
DictTest2.o	14447	14303	-1.0%	1.01x

Performance: -Osize

Regression	OLD	NEW	DELTA	RATIO
DropWhileArray	24	35	+45.8%	0.69x
NSStringConversion.Long	492	590	+19.9%	0.83x
PrefixAnySeqCntRangeLazy	81	94	+16.0%	0.86x
PrefixWhileAnySeqCntRangeLazy	81	94	+16.0%	0.86x
ObjectAllocation	113	131	+15.9%	0.86x (?)
ArrayAppendSequence	590	680	+15.3%	0.87x (?)
DropWhileAnySeqCRangeIter	98	111	+13.3%	0.88x
NSStringConversion.Medium	236	266	+12.7%	0.89x (?)
StringBuilderLong	840	930	+10.7%	0.90x (?)
DropFirstAnySeqCRangeIterLazy	123	136	+10.6%	0.90x (?)
Dict.FilterAllMatch.28k	930	1012	+8.8%	0.92x (?)
Set.isSuperset.Seq.Empty.Int	47	51	+8.5%	0.92x (?)
ArrayAppendAscii	1258	1360	+8.1%	0.93x (?)
ArrayAppendLatin1	1258	1360	+8.1%	0.93x (?)
ArrayAppendUTF16	1258	1360	+8.1%	0.93x (?)

Improvement	OLD	NEW	DELTA	RATIO
DropLastArrayLazy	9	4	-55.5%	2.25x
DropLastCountableRangeLazy	9	6	-33.3%	1.50x
DropWhileAnySeqCntRangeLazy	184	126	-31.5%	1.46x
DropWhileAnyCollection	121	94	-22.3%	1.29x (?)
DropWhileAnySeqCRangeIterLazy	162	126	-22.2%	1.29x
DropFirstAnySeqCntRangeLazy	136	114	-16.2%	1.19x
SortIntPyramid	530	450	-15.1%	1.18x (?)
ArrayLiteral2	115	98	-14.8%	1.17x (?)
Phonebook	1274	1155	-9.3%	1.10x
DataCreateEmptyArray	1650	1500	-9.1%	1.10x (?)
DistinctClassFieldAccesses	211	193	-8.5%	1.09x (?)
DropWhileCountableRangeLazy	49	45	-8.2%	1.09x
UTF8Decode_InitFromBytes	141	131	-7.1%	1.08x (?)
UTF8Decode_InitFromData	133	124	-6.8%	1.07x (?)

Code size: -Osize

Regression	OLD	NEW	DELTA	RATIO
BucketSort.o	11183	11423	+2.1%	0.98x
SortLettersInPlace.o	8131	8243	+1.4%	0.99x

Improvement	OLD	NEW	DELTA	RATIO
PrefixWhile.o	18472	17716	-4.1%	1.04x
DropWhile.o	19132	18364	-4.0%	1.04x
Prefix.o	19452	18700	-3.9%	1.04x
StringMatch.o	4572	4476	-2.1%	1.02x

Performance: -Onone

Improvement	OLD	NEW	DELTA	RATIO
CharacterLiteralsLarge	491	449	-8.6%	1.09x (?)
SortStrings	1444	1325	-8.2%	1.09x (?)

Code size: -swiftlibs

How to read the data

The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.

If you see any unexpected regressions, you should consider fixing the
regressions before you merge the PR.

Noise: Sometimes the performance results (not code size!) contain false
alarms. Unexpected regressions which are marked with '(?)' are probably noise.
If you see regressions which you cannot explain you can try to run the
benchmarks again. If regressions still show up, please consult with the
performance team (@eeckstein).

Hardware Overview

  Model Name: Mac mini
  Model Identifier: Macmini8,1
  Processor Name: Intel Core i7
  Processor Speed: 3.2 GHz
  Number of Processors: 1
  Total Number of Cores: 6
  L2 Cache (per Core): 256 KB
  L3 Cache: 12 MB
  Memory: 64 GB

swift-ci · 2019-12-16T08:57:28Z

Build failed
Swift Test OS X Platform
Git Sha - a5e5c410dc810c86f9f70e59e7f30a2221f3b709

swift-ci · 2019-12-16T09:04:29Z

Build failed
Swift Test Linux Platform
Git Sha - a5e5c410dc810c86f9f70e59e7f30a2221f3b709

eeckstein

Basically LGTM.
But I'm wondering if it's worth adding the complexity of may/must-have-reference.
I think it's better to strictly distinguish between reference and address-or-pointer nodes. That would mean that those two types of nodes are never merged. To achieve this it's probably enough to treat raw_pointer_to_ref conservatively.

include/swift/SILOptimizer/Analysis/EscapeAnalysis.h

eeckstein · 2019-12-16T08:40:13Z

include/swift/SILOptimizer/Analysis/EscapeAnalysis.h

+    /// base object for a SIL memory access, which may look through
+    /// ref_element_addr, if the access is to this node's pointsTo content, then
+    /// the access' base object must also be mapped to the pointsTo node, not
+    /// this node.


I can guess what "isInteriorFlag" means, but after reading this comment, I'm confused.

It's surprisingly hard to put into words:

/// True if this node's pointsTo content is part of the same object with /// respect to SIL memory access. When this is true, then this node must /// have a content node. /// /// When this flag is false, it provides a "separate access guarantee" for /// stronger analysis. If the base object for a SIL memory access is mapped /// to this node, then the accessed memory must have the same escaping /// properties as the base object. In contrast, when this is true, the /// connection graph may model the escaping properties of the base object /// separately from the accessed memory.

lib/SILOptimizer/Analysis/EscapeAnalysis.cpp

include/swift/SILOptimizer/Analysis/EscapeAnalysis.h

atrick · 2019-12-16T19:45:37Z

@eeckstein
Oops, I meant to eliminate the MustHavePointer kind. It didn't serve any purpose. So there are only three pointer "kinds".

Also, I think it's fine to call the "reference" kind ReferenceOnly, which matches the node flag. There's no need to distinguish between memory that may contain a reference vs. memory that always contains at least reference.

So, there are only three kinds of pointers:

NoPointer (don't create a node)

ReferenceOnly (safe to make normal assumptions)

AnyPointer (may have addresses, rawpointers, or any mix of thoses with references)

You might be suggesting that we have a NonReferencePointer instead, which we know doesn't contain references, but that doesn't simplify any of the analysis.

I don't know how to prevent different kinds of nodes from being merged. When I first started working on this I wanted to assert that merges always obeyed/preserved the graph properties. I only realized that was hopeless after a very long time debugging.

Maybe there's an alternate world where we only ever create content nodes for references. I'm not sure what we'd be giving up to gain that. We have bridge object casting, which is probably on the critical path, and rawpointer fields in objects that also have references. The easiest way for me to guarantee that we don't need to always assume worst case given the possibility of non-reference pointers is just to recognize the nodes that can't contain those bad things and conservatively flag and merge them. There's another issue of handling a null content node, where, without a flag, we'd always need to assume the worst. The missing node could be because the content was never accessed (optimistic) or because the pointer value contained or merged with some non-reference pointer value (pessimistic).

Categorize three kinds of pointers: NoPointer (don't create a node) ReferenceOnly (safe to make normal assumptions) AnyPointer (may have addresses, rawpointers, or any mix of thoses with references) Flag ConnectionGraph nodes as - hasReferenceOnly - isInterior An interior node always has an additional content node. All sorts of arbitrary node merging is supported. Nodes with totally different properties can be safely merged. Interior nodes can safely be merged with their field content (which does happen surprisingly often). Alias analysis will use these flags to safely make assumptions about properties of the connection graph.

…ent node only. For alias analysis query to be generally correct, we need to effectively merge the escape state and use points for everything in a defer web. It was unclear from the current design whether the "escaping" property applied to the pointer value or its content. The implementation is inconsistent in how it was treated. It appears that some bugs have been worked around by propagating forward through defer edges, some have been worked around by querying the content instead of the pointer, and others have been worked around be creating fake use points at block arguments. If we always simply query the content for escape state and use points, then we never need to propagate along defer edges. The current code that propagates escape state along defer edges in one direction is simply incorrect from the perspective of alias analysis. One very attractive solution is to merge nodes eagerly without creating any defer edges, but that would be a much more radical change even than what I've done here. It would also pose some new issues: how to resolve the current "node types" when merging and how to deal with missing content nodes. This solution of applying escape state to content nodes solves all these problems without too radical of a change at the expense of eagerly creating content nodes. (The potential graph memory usage is not really an issue because it's possible to drastically shrink the size of the graph anyway in a future commit--I've been able to fit a node within one cache line). This solution nicely preserves graph structure which makes it easy to debug and relate to the IR. Eagerly creating content nodes also solves the missing content node problem. For example, when querying canEscapeTo, we need to know whether to look at the escape state for just the pointer value itself, or also for its content. It may be possible the its content node is actually part of the same object at the IR level. If the content node is missing, then we don't know if the object's interior address is not recognizable/representable or whether we simply never saw an access to the interior address. We can't simply look at whether the current IR value happens to be a reference, because that doesn't tell us whether the graph node may have been merged with a non-reference node or even with it's own content node. To be correct in general, this query would need to be extremely conservative. However, if content nodes are always created for references, then we only need to query the escape state of a pointer's content node. The content node's flag tells us if it's an interior node, in which case it will always point to another content node which also needs to be queried.

That appears to have been a partial workaround for the real problem that usepoints need to be propagated across the entire defer web. This is now solved by considering use points on the reference node's content, not the reference node itself.

atrick · 2019-12-17T00:47:58Z

@swift-ci test

swift-ci · 2019-12-17T00:48:33Z

Build failed
Swift Test OS X Platform
Git Sha - a5e5c410dc810c86f9f70e59e7f30a2221f3b709

swift-ci · 2019-12-17T00:50:27Z

Build failed
Swift Test Linux Platform
Git Sha - a5e5c410dc810c86f9f70e59e7f30a2221f3b709

eeckstein · 2019-12-17T13:26:07Z

The type safety of SIL should ensure that we don't merge reference and address/pointer types. Of course, as long as we treat instructions, which convert between the both, conservatively.
But never mind, if you think it's not worth doing it, I'm fine with it.

atrick · 2019-12-17T18:54:38Z

@eeckstein right, we could treat all casts that change pointer type as a completely separate value. But that includes bridge object casts. I'm always nervous that we won't recognize some array/dictionary pattern. Also, before this we didn't have a solution for objects that contain both reference and raw pointer fields. At any rate, this way we don't give up any precision in the graph, we just lose the node flags which makes AliasAnalysis a bit more conservative in that case.

To be a little more clear: If we ever have a "content" node that represents a raw pointer, then we ca run into correctness issues if we don't have these flags. So, for example, if we ever load a raw pointer or load from an object containing a raw pointer then we end up with a content node that may be merged arbitrarily with other nodes. I'm operating under the rule that it should be safe to merge any two content nodes regardless of where they came from. Every time I thought I could make assumptions about what could be merged, I was proven wrong once I put the assertions in place.

Correctness: do not make any unenforced assumptions about how the connection graph is built (I don't think the previous assumption about the structure of the graph node mapped to a reference-type value would always hold if content nodes can be arbitrarily merged). Only make one assumption about the client code: the access being checked must be to some address within the provided value, not another object indirectly reachable from that value. Optimization: Allow escape analysis to prove that an addressable object does not escape even when one of its reference-type fields escapes.

Fix tests for changes to EscapeAnalysis output - New node flags - Only content nodes are marked escaping - Content nodes are eagerly created - No defer edges for block args

atrick · 2019-12-17T19:05:44Z

@swift-ci test

atrick · 2019-12-17T19:05:52Z

@swift-ci test source compatibility

swift-ci · 2019-12-17T19:07:50Z

Build failed
Swift Test Linux Platform
Git Sha - 8292c7dafa65af6df26f21f90c5a904e06cb9938

swift-ci · 2019-12-17T19:09:48Z

Build failed
Swift Test OS X Platform
Git Sha - 8292c7dafa65af6df26f21f90c5a904e06cb9938

eeckstein · 2019-12-17T19:44:29Z

okay

eeckstein approved these changes Dec 16, 2019

View reviewed changes

atrick mentioned this pull request Dec 16, 2019

[WIP] EscapeAnalysis: rewrite canEscapeToUsePoint. #28564

Closed

atrick added 3 commits December 16, 2019 16:43

atrick changed the title ~~[WIP] EscapeAnalysis: add node flags and change the meaning of "escaping"~~ EscapeAnalysis: add node flags and change the meaning of "escaping" Dec 17, 2019

atrick added 2 commits December 17, 2019 11:05

EscapeAnalysis: Add and update tests.

fa47f19

Fix tests for changes to EscapeAnalysis output - New node flags - Only content nodes are marked escaping - Content nodes are eagerly created - No defer edges for block args

atrick merged commit 3fdf6c8 into swiftlang:master Dec 18, 2019

eeckstein mentioned this pull request Dec 18, 2019

Revert "EscapeAnalysis: add node flags and change the meaning of "escaping"" #28848

Merged

atrick deleted the escapeflags branch December 23, 2019 03:03

EscapeAnalysis: add node flags and change the meaning of "escaping" #28800

EscapeAnalysis: add node flags and change the meaning of "escaping" #28800

Uh oh!

Conversation

atrick commented Dec 16, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

atrick commented Dec 16, 2019

Uh oh!

atrick commented Dec 16, 2019

Uh oh!

atrick commented Dec 16, 2019

Uh oh!

swift-ci commented Dec 16, 2019

Performance: -O

Code size: -O

Performance: -Osize

Code size: -Osize

Performance: -Onone

Code size: -swiftlibs

Uh oh!

swift-ci commented Dec 16, 2019

Uh oh!

swift-ci commented Dec 16, 2019

Uh oh!

eeckstein left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

eeckstein Dec 16, 2019

Choose a reason for hiding this comment

Uh oh!

atrick Dec 16, 2019

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

atrick commented Dec 16, 2019

Uh oh!

atrick commented Dec 17, 2019

Uh oh!

swift-ci commented Dec 17, 2019

Uh oh!

swift-ci commented Dec 17, 2019

Uh oh!

eeckstein commented Dec 17, 2019

Uh oh!

atrick commented Dec 17, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

atrick commented Dec 17, 2019

Uh oh!

atrick commented Dec 17, 2019

Uh oh!

swift-ci commented Dec 17, 2019

Uh oh!

swift-ci commented Dec 17, 2019

Uh oh!

eeckstein commented Dec 17, 2019

Uh oh!

Uh oh!

atrick commented Dec 16, 2019 •

edited

Loading

atrick commented Dec 17, 2019 •

edited

Loading