Commit 9627a05

Document EscapeAnalysis.
Describe the algorithm in the file-level doc comment. The basic algorithm in the referenced paper is similar, but the most interesting/important information is how it is adapted to SIL.
1 parent c8a16d3 commit 9627a05

2 files changed: +162 -12 lines changed

include/swift/SILOptimizer/Analysis/EscapeAnalysis.h

Lines changed: 157 additions & 12 deletions
@@ -9,6 +9,140 @@
 // See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
 //
 //===----------------------------------------------------------------------===//
+///
+/// EscapeAnalysis provides information about whether the lifetime of an object
+/// exceeds the scope of a function.
+///
+/// We compute escape analysis by building a connection graph for each
+/// function. For interprocedural analysis the connection graphs are merged
+/// in bottom-up order of the call graph.
+/// The idea is based on "Escape analysis for Java." by J.-D. Choi, M. Gupta, M.
+/// Serrano, V. C. Sreedhar, and S. Midkiff
+/// http://dx.doi.org/10.1145/320384.320386
+///
+/// This design is customized for SIL and the Swift memory model as follows:
+///
+/// Each SILValue holding a memory address or object reference is mapped to a
+/// node in the connection graph. The node's type depends on the value's
+/// origin. SILArguments have "argument" type. Locally allocated storage and
+/// values of unknown origin have "value" type. Loaded values have "content"
+/// type. A "return" type node represents the returned value and has no
+/// associated SILValue.
+///
+/// "Content" nodes are special in that they represent the identity of some set
+/// of memory locations. Content nodes are created to represent the memory
+/// pointed to by one of the other node types. So, except for loads, SILValues
+/// do not directly map to content nodes. For debugging purposes only, content
+/// nodes do refer back to the SILValue that originally pointed to them. When
+/// content nodes are merged, only one of those SILValue back-references is
+/// arbitrarily preserved. The content of the returned value is the only content
+/// node that has no back-reference to a SILValue.
+///
+/// This code:
+///   let a = SomeClass()
+///   return a
+///
+/// Generates the following connection graph, where 'a' is in the SILValue %0:
+///   Val %0   Esc: R, Succ: (%0.1) // Represents 'a', and points to 'a's content
+///   Con %0.1 Esc: G, Succ:        // Represents the content of 'a'
+///   Ret      Esc: R, Succ: %0     // The returned value, aliased with 'a'
+///
+/// Each node has an escaping state: None, (R)eturn, (A)rguments, or (G)lobal.
+/// These states form a lattice in which None is the most refined, or top, state
+/// and Global is the least refined, or bottom, state. Merging nodes performs a
+/// meet operation on their escaping states. At a call site, the callee graph is
+/// merged with the caller graph by merging the respective call argument
+/// nodes. A node has a "Return" escaping state if it only escapes by being
+/// returned from the current function. A node has an "Argument" escaping state
+/// if it only escapes by being passed as an incoming argument to this function.
+///
+/// A directed edge between two connection graph nodes indicates that the memory
+/// represented by the destination node is reachable via an address
+/// contained in the source node. A node may only have one "pointsTo" edge,
+/// whose destination is always a content node. Additional "defer" edges allow a
+/// form of aliasing between nodes. A single content node represents any and all
+/// memory that any other node may point to. This content node can be found by
+/// following any path of defer edges until the path terminates in a pointsTo
+/// edge. The final pointsTo edge refers to the representative content node, and
+/// all such paths in the graph must reach the same content node. To maintain
+/// this invariant, the algorithm that builds the connection graph must
+/// incrementally merge content nodes.
+///
+/// Note that a defer edge may occur between any node types. A value node that
+/// holds a reference may defer to another value or content node whose value was
+/// merged via a phi; a content node that holds a reference may defer to a value
+/// node that was stored into the content; a content node may defer to another
+/// content node that was loaded and stored.
+///
+/// Now consider the same example, but declaring a 'var' instead of a 'let':
+///
+///   var a = SomeClass()
+///   return a
+///
+/// Generates the following connection graph, where the alloc_stack for variable
+/// 'a' is in the SILValue %0 and class allocation returns SILValue %3.
+///   Val %0   Esc: G, Succ: (%0.1)
+///   Con %0.1 Esc: G, Succ: %3
+///   Val %3   Esc: G, Succ: (%3.1)
+///   Con %3.1 Esc: G, Succ:
+///   Ret      Esc: R, Succ: %3
+///
+/// The value node for variable 'a' now points to local variable storage
+/// (%0.1). That local variable storage contains a reference. Assignment into
+/// that reference creates a defer edge to the allocated reference (%3). The
+/// allocated reference in turn points to the object storage (%3.1).
+///
+/// Note that a variable holding a single class reference and a variable
+/// holding a non-trivial struct have the same graph representation. The
+/// variable's content node only represents the value of the references, not the
+/// memory pointed to by the references.
+///
+/// A pointsTo edge does not necessarily indicate pointer indirection. It may
+/// simply represent a derived address within the same object. This allows
+/// escape analysis to view an object's memory in layers, each with separate
+/// escaping properties. For example, a class object's first-level content node
+/// represents the object header including the metadata pointer and reference
+/// count. An object's second-level content node only represents the
+/// reference-holding fields within that object. Consider the connection graph
+/// for a class with properties:
+///
+///   class HasObj {
+///     var obj: AnyObject
+///   }
+///   func assignProperty(h: HasObj, o: AnyObject) {
+///     h.obj = o
+///   }
+///
+/// Which generates this graph where the argument 'h' is %0, and 'o' is %1:
+///   Arg %0   Esc: A, Succ: (%0.1)
+///   Con %0.1 Esc: A, Succ: (%0.2)
+///   Con %0.2 Esc: A, Succ: %1
+///   Arg %1   Esc: A, Succ: (%1.1)
+///   Con %1.1 Esc: A, Succ: (%1.2)
+///   Con %1.2 Esc: G, Succ:
+///
+/// Node %0.1 represents the header of 'h', including reference count and
+/// metadata pointer. This node points to %0.2 which represents the 'obj'
+/// property. The assignment 'h.obj = o' creates a defer edge from %0.2 to
+/// %1. Similarly, %1.1 represents the header of 'o', and %1.2 represents any
+/// potential nontrivial properties in 'o' which may have escaped globally when
+/// 'o' was released.
+///
+/// The connection graph is constructed by summarizing all memory operations in
+/// a flow-insensitive way. Hint: ConGraph->viewCG() displays the Dot-formatted
+/// connection graph.
+///
+/// In addition to the connection graph, EscapeAnalysis stores information about
+/// "use points". Each release operation is a use point. These instructions are
+/// recorded in a table and given an ID. Each connection graph node stores a
+/// bitset indicating the use points reachable via the CFG by that node. This
+/// provides some flow-sensitive information on top of the otherwise
+/// flow-insensitive connection graph.
+///
+/// Note: storing bitsets in each node may be unnecessary overhead since the
+/// same information can be obtained with a graph traversal, typically of only
+/// 1-3 hops.
+// ===---------------------------------------------------------------------===//

 #ifndef SWIFT_SILOPTIMIZER_ANALYSIS_ESCAPEANALYSIS_H_
 #define SWIFT_SILOPTIMIZER_ANALYSIS_ESCAPEANALYSIS_H_
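The escaping-state lattice described in the file-level comment can be sketched in isolation. The snippet below is a minimal, hypothetical model, not the code from EscapeAnalysis.h: the names `ToyEscapeState`, `meet`, and `ToyStateHolder` are invented for illustration, and it assumes the four states are totally ordered from None (top) to Global (bottom), so that the meet is simply the less refined of the two states.

```cpp
#include <cassert>

// Hypothetical stand-in for the four escaping states. The declaration order
// is an assumption: most refined (None) first, least refined (Global) last.
enum class ToyEscapeState { None, Return, Arguments, Global };

// Meet of two states under the assumed total order: the less refined one.
constexpr ToyEscapeState meet(ToyEscapeState A, ToyEscapeState B) {
  return static_cast<int>(A) > static_cast<int>(B) ? A : B;
}

// Hypothetical node-like holder that merges in another state and reports
// whether its own state changed, in the spirit of mergeEscapeState.
struct ToyStateHolder {
  ToyEscapeState State = ToyEscapeState::None;

  bool mergeEscapeState(ToyEscapeState Other) {
    ToyEscapeState Merged = meet(State, Other);
    bool Changed = Merged != State;
    State = Merged;
    return Changed;
  }
};
```

Under this assumption, merging can only move a node down the lattice, which is why a fixed-point iteration over the graph terminates.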
@@ -28,15 +162,9 @@ namespace swift {

 class BasicCalleeAnalysis;

-/// The EscapeAnalysis provides information if the lifetime of an object exceeds
-/// the scope of a function.
-///
-/// We compute the escape analysis by building a connection graph for each
-/// function. For the interprocedural analysis the connection graphs are merged
-/// in bottom-up order of the call graph.
-/// The idea is based on "Escape analysis for Java." by J.-D. Choi, M. Gupta, M.
-/// Serrano, V. C. Sreedhar, and S. Midkiff
-/// http://dx.doi.org/10.1145/320384.320386
+/// The EscapeAnalysis results for functions in the current module, computed
+/// bottom-up in the call graph. Each function with valid EscapeAnalysis
+/// information is associated with a ConnectionGraph.
 class EscapeAnalysis : public BottomUpIPAnalysis {

   /// The types of edges in the connection graph.
@@ -169,8 +297,21 @@ class EscapeAnalysis : public BottomUpIPAnalysis {
     NodeType Type;

     /// The constructor.
-    CGNode(ValueBase *V, NodeType Type) :
-      V(V), UsePoints(0), Type(Type) { }
+    CGNode(ValueBase *V, NodeType Type) : V(V), UsePoints(0), Type(Type) {
+      switch (Type) {
+      case NodeType::Argument:
+      case NodeType::Value:
+        assert(V);
+        break;
+      case NodeType::Return:
+        assert(!V);
+        break;
+      case NodeType::Content:
+        // A content node representing the returned value has no associated
+        // SILValue.
+        break;
+      }
+    }

     /// Merges the state from another state and returns true if it changed.
     bool mergeEscapeState(EscapeState OtherState) {
@@ -452,7 +593,11 @@ class EscapeAnalysis : public BottomUpIPAnalysis {
     /// Returns null, if V is not a "pointer".
     CGNode *getNode(ValueBase *V, EscapeAnalysis *EA, bool createIfNeeded = true);

-    /// Gets or creates a content node to which \a AddrNode points to.
+    /// Gets or creates a content node to which \a AddrNode points during
+    /// initial graph construction. This may not be called after defer edges
+    /// have been created. Doing so would break the invariant that all
+    /// non-content nodes ultimately have a pointsTo edge to a single content
+    /// node.
     CGNode *getContentNode(CGNode *AddrNode);

     /// Get or creates a pseudo node for the function return value.
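The invariant mentioned in the getContentNode comment (every path of defer edges ends in a pointsTo edge to the same content node) can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyNode` and `findContent` are hypothetical names, not part of the ConnectionGraph API, and the lookup simply searches defer edges depth-first until it finds a node that has a pointsTo edge.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical toy node: at most one pointsTo edge, any number of defer edges.
struct ToyNode {
  ToyNode *PointsTo = nullptr;
  std::vector<ToyNode *> Defers;
};

// Follows defer edges depth-first until a node with a pointsTo edge is found.
// The visited set guards against cycles of defer edges.
ToyNode *findContent(ToyNode *N, std::unordered_set<ToyNode *> &Visited) {
  if (!N || !Visited.insert(N).second)
    return nullptr;
  if (N->PointsTo)
    return N->PointsTo;
  for (ToyNode *D : N->Defers)
    if (ToyNode *Content = findContent(D, Visited))
      return Content;
  return nullptr;
}

// Convenience wrapper that starts a fresh traversal.
ToyNode *findContent(ToyNode *N) {
  std::unordered_set<ToyNode *> Visited;
  return findContent(N, Visited);
}
```

For the 'var' example in the file-level comment, a toy graph with %0 pointing to %0.1, %0.1 deferring to %3, and %3 pointing to %3.1 would resolve the content of %0.1 to %3.1 by way of the defer edge.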

lib/SILOptimizer/Analysis/EscapeAnalysis.cpp

Lines changed: 5 additions & 0 deletions
@@ -1413,8 +1413,13 @@ void EscapeAnalysis::analyzeInstruction(SILInstruction *I,
       // the object itself (because it will be a dangling pointer after
       // deallocation).
       CGNode *CapturedByDeinit = ConGraph->getContentNode(AddrNode);
+      // Get the content node for the object's properties. The object header
+      // itself cannot escape from the deinit.
       CapturedByDeinit = ConGraph->getContentNode(CapturedByDeinit);
       if (deinitIsKnownToNotCapture(OpV)) {
+        // Presumably this is necessary because, even though the deinit
+        // doesn't escape the immediate properties of this class, it may
+        // indirectly escape some other memory content(?)
         CapturedByDeinit = ConGraph->getContentNode(CapturedByDeinit);
       }
       ConGraph->setEscapesGlobal(CapturedByDeinit);
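The repeated getContentNode calls in this hunk step through the layers of an object's memory (header, then properties, then one layer further). A minimal standalone sketch of that layering follows. It is not the real ConnectionGraph: `LayerNode`, `getContentLayer`, and `markCapturedByDeinit` are hypothetical names, and each node is assumed to own at most one content child that is created on demand.

```cpp
#include <cassert>
#include <memory>

// Hypothetical toy node with at most one content layer beneath it.
struct LayerNode {
  std::unique_ptr<LayerNode> Content;
  bool EscapesGlobal = false;
};

// Gets or creates the content layer that N points to.
LayerNode *getContentLayer(LayerNode *N) {
  assert(N && "expected a node");
  if (!N->Content)
    N->Content = std::make_unique<LayerNode>();
  return N->Content.get();
}

// Mirrors the shape of the release handling in the hunk: skip the object
// header, optionally skip the immediate properties, and mark the layer that
// is reached as escaping globally. Returns the marked layer.
LayerNode *markCapturedByDeinit(LayerNode *AddrNode,
                                bool DeinitKnownNotToCapture) {
  LayerNode *Captured = getContentLayer(AddrNode); // object header
  Captured = getContentLayer(Captured);            // object's properties
  if (DeinitKnownNotToCapture)
    Captured = getContentLayer(Captured);          // one layer further
  Captured->EscapesGlobal = true;
  return Captured;
}
```

In this model the header layer is never marked, which matches the added comment that the object header itself cannot escape from the deinit.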
