-
Notifications
You must be signed in to change notification settings - Fork 10.5k
EscapeAnalysis cleanup and add utilities [nearly NFC] #28103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@eeckstein if you skim the diff side-by-side with the commit message, it should be quick to review. The commit message walks you through the diff. But if anything gets confusing just stop and ask me. These are mostly the things that tripped me up when trying to understand the code or debugging my attempts to change the code. |
@swift-ci test |
@swift-ci benchmark |
@swift-ci test source compatibility |
@eeckstein Actually, there is one important functional change that fell out of The functional change is that getPointerBase now sees through all of
This coalesces a whole bunch of nodes together that were just there |
Performance: -O
Code size: -OPerformance: -Osize
Code size: -OsizePerformance: -Onone
Code size: -swiftlibsHow to read the dataThe tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%.If you see any unexpected regressions, you should consider fixing the Noise: Sometimes the performance results (not code size!) contain false Hardware Overview
|
@swift-ci smoke test |
This is the first in a series of patches that reworks EscapeAnalysis. For this patch, I extracted every change that does not introduce new features, rewrite logic, or appear to change functionality. These cleanups were done in preparation for: - adding a graph representation for reference counted objects - rewriting parts to the query logic - ...which then allows the analysis to safely assume that all exclusive arguments are unique - ...which then allows more aggressive optimization of local variables that don't escape There are two possible functional changes: 1. getUnderlyingAddressRoot in InstructionUtils now sees through OSSA instructions: begin_borrow and copy_value 2. The getPointerBase helper in EscapeAnalysis now sees through all of these reference and pointer casts: + case ValueKind::UncheckedRefCastInst: + case ValueKind::ConvertFunctionInst: + case ValueKind::UpcastInst: + case ValueKind::InitExistentialRefInst: + case ValueKind::OpenExistentialRefInst: + case ValueKind::RawPointerToRefInst: + case ValueKind::RefToRawPointerInst: + case ValueKind::RefToBridgeObjectInst: + case ValueKind::BridgeObjectToRefInst: + case ValueKind::UncheckedAddrCastInst: + case ValueKind::UnconditionalCheckedCastInst: + case ValueKind::RefTo##Name##Inst: + case ValueKind::Name##ToRefInst: This coalesces a whole bunch of nodes together that were just there because of casts. The existing code was already doing this for one level of casts, but there was a bug that prevented it from happening transitively. So, in theory, anything that breaks with this fix could also break without this fix, but may not have been exposed. The fact that this analysis coalesces address-to-reference casts at all is what caused me to spent vast amounts of time debugging any time I tried to force some structure on the graph via assertions. If it is done at all, it should be done everywhere consistently to expose issues as early as possible. Here is a description of the changes in diff order. If something in the diff is not listed here, then the code probably just moved around in the file: Rename isNotAliasedIndirectParameter to isExclusiveIndirectParameter. The argument may be aliased in the caller's scope and it's contents may have already escaped. Add comments to SILType APIs (isTrivial, isReferenceCounted) that give answers about the AST type which don't really make sense for address SILTypes. Add comments about CGNode's 'Value' field. I spent lots of time attempting to tighten this down with asserts, but it's only possible for non-content nodes. For content nodes, the node's value is highly unpredictable and basically nonsense but needed for debugging. Add comments about not assuming that the content nodes pointsTo edge represents physical indirection. This matters when reasoning about aliasing and it's a tempting assumption to make. Add a CGNode::mergeProperties placeholder for adding node properties. Factor out a CGNode::canAddDeferred helper for use later. Rename `setPointsTo` to `setPointsToEdge` because it actually creates an edge rather than just setting `pointsTo`. Add CGNode::getValue() and related helpers to help maintain invariants. Factor out a `markEscaping` helper. Clean up the `escapesInsideFunction` helper. Add node visitor helpers: visitSuccessors, visitDefers. This made is much easier to prototype utilities. Add comments to clarify the `pointsTo` invariant. In particular, an entire defer web may have a null pointsTo for a while. Add an `activeWorklist` to avoid nasty bugs when composing multiple helpers that each use the worklist. Remove the `EA` argument from `getNode`. I ended up needing access to the `EA` context from the ConnectionGraph many times during prototyping and passing `this` was all the `getNode` calls was very silly. Add graph visitor helpers: backwardTraverse, forwardTraverseDefer, forwardTraversePointsToEdges, and mayReach for ease in developing new logic and utilities. Add isExclusiveArgument helper and distinguish exclusive arguments from local objects. Confusing these properties could lead to scary bugs. For example, unlike a local object, an exclusive argument's contents may still escape even when the content's connection graph node is non-escaping! Add isUniquelyIdentified helper when we want to treat exclusive arguments and local objects uniformly. getUnderlyingAddressRoot now looks through OSSA instructions. Rename `getAccessedMemory` to `getDirectlyAccessedMemory` with comments. This is another dangerous API because it assumes the memory access to a given address won't touch memory represented by different graph nodes, but graph edges don't necessarily represent physical indirection. Further clarify this issue in comments in AliasAnalysis.cpp. Factor out a 'findRecursiveRefType' helper from the old 'mayContainReference' for checking whether types may or must contain references. Support both kinds of queries so the analysis can be certain when a pointer's content is a physical heap object. Factor out 'getPointerBase' and 'getPointerRoot' helpers that follow address projections within what EscapeAnalysis will consider a single node. Create a CGNodeWorklist abstraction to safely formalize the worklist mechanism that's used all over the place. In one place, there were even two separate independent lists used independently (nodes added to one list could appear to be in the other list). The CGNodeMap abstraction did not significantly change, it was just moved. Added 'dumpCG' for dumping .dot files making it possible to remote debug. Added '-escapes-enable-graphwriter' option to dump .dot files, since they are so much more useful than the textual dump of the connection graph, which lacks node identity!
@swift-ci smoke test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general LGTM.
See my comments inline
We should not coalesce address-to-reference casts in favour of keeping this nice invariants in the graph to distinguish between "object" and "address" nodes (as we talked some time ago). |
@eeckstein I can't really say whether we lose important information in practice when we treat "interesting" casts as unknown pointers. It looked like PointerToAddress and AddressToPointer were important, and I felt that if we handle those we should continue to handle the rest. I'm afraid that if we stop recognizing local ArrayBuffers, we might miss some optimization. Who knows where else it might be important in the future. I should have tried to eliminate cast handling first before adding information about heap objects to avoid time debugging issues where they kept breaking my verification. But now I have an implementation that supports the casts. The implementation isn't any more complicated, I just wasn't able to add as much self-checking. I want to make sure the changes I already have that may make some queries more conservative don't regress performance before adding other potential regressions on top of it. |
@swift-ci smoke test |
AnalyzeInstruction no longer needs to handle all these casts. getPointerBase does it now.
@swift-ci smoke test |
@atrick We can still handle most of the casts, like pointer_to_address and address_to_pointer. Only ref_to_raw_pointer/raw_pointer_to_ref is problematic. |
@eeckstein I have a few other EscapeAnalysis PRs to get through, then I can try removing support for merging address and ref type nodes. It's only useful if I can actually assert that we never mix address and ref types, but then I'm worried that assert will trigger in some rare situation we don't have tests for. It's hard for me to prove to myself that all possibilities are covered. For example:
|
This is the first in a series of patches that reworks
EscapeAnalysis. For this patch, I extracted every change that does not
introduce new features, rewrite logic, or appear to change
functionality.
These cleanups were done in preparation for:
There are two possible functional changes:
getUnderlyingAddressRoot in InstructionUtils now sees through OSSA
instructions: begin_borrow and copy_value
The getPointerBase helper in EscapeAnalysis now sees through all of
these reference and pointer casts:
This coalesces a whole bunch of nodes together that were just there
because of casts. The existing code was already doing this for one
level of casts, but there was a bug that prevented it from happening
transitively. So, in theory, anything that breaks with this fix could
also break without this fix, but may not have been exposed. The fact
that this analysis coalesces address-to-reference casts at all is what
caused me to spent vast amounts of time debugging any time I tried to
force some structure on the graph via assertions. If it is done at
all, it should be done everywhere consistently to expose issues as
early as possible.
Here is a description of the changes in diff order. If something in
the diff is not listed here, then the code probably just moved around
in the file:
Rename isNotAliasedIndirectParameter to
isExclusiveIndirectParameter. The argument may be aliased in the
caller's scope and it's contents may have already escaped.
Add comments to SILType APIs (isTrivial, isReferenceCounted) that give
answers about the AST type which don't really make sense for address
SILTypes.
Add comments about CGNode's 'Value' field. I spent lots of time
attempting to tighten this down with asserts, but it's only possible
for non-content nodes. For content nodes, the node's value is highly
unpredictable and basically nonsense but needed for debugging.
Add comments about not assuming that the content nodes pointsTo edge
represents physical indirection. This matters when reasoning about
aliasing and it's a tempting assumption to make.
Add a CGNode::mergeProperties placeholder for adding node properties.
Factor out a CGNode::canAddDeferred helper for use later.
Rename
setPointsTo
tosetPointsToEdge
because it actually createsan edge rather than just setting
pointsTo
.Add CGNode::getValue() and related helpers to help maintain invariants.
Factor out a
markEscaping
helper.Clean up the
escapesInsideFunction
helper.Add node visitor helpers: visitSuccessors, visitDefers. This made is
much easier to prototype utilities.
Add comments to clarify the
pointsTo
invariant. In particular, anentire defer web may have a null pointsTo for a while.
Add an
activeWorklist
to avoid nasty bugs when composing multiplehelpers that each use the worklist.
Remove the
EA
argument fromgetNode
. I ended up needing access tothe
EA
context from the ConnectionGraph many times duringprototyping and passing
this
was all thegetNode
calls was verysilly.
Add graph visitor helpers: backwardTraverse, forwardTraverseDefer,
forwardTraversePointsToEdges, and mayReach for ease in developing new
logic and utilities.
Add isExclusiveArgument helper and distinguish exclusive arguments
from local objects. Confusing these properties could lead to scary
bugs. For example, unlike a local object, an exclusive argument's
contents may still escape even when the content's connection graph
node is non-escaping!
Add isUniquelyIdentified helper when we want to treat exclusive
arguments and local objects uniformly.
getUnderlyingAddressRoot now looks through OSSA instructions.
Rename
getAccessedMemory
togetDirectlyAccessedMemory
withcomments. This is another dangerous API because it assumes the memory
access to a given address won't touch memory represented by different
graph nodes, but graph edges don't necessarily represent physical
indirection. Further clarify this issue in comments in
AliasAnalysis.cpp.
Factor out a 'findRecursiveRefType' helper from the old
'mayContainReference' for checking whether types may or must contain
references. Support both kinds of queries so the analysis can be
certain when a pointer's content is a physical heap object.
Factor out 'getPointerBase' and 'getPointerRoot' helpers that follow
address projections within what EscapeAnalysis will consider a single
node.
Create a CGNodeWorklist abstraction to safely formalize the worklist
mechanism that's used all over the place. In one place, there were
even two separate independent lists used independently (nodes added to
one list could appear to be in the other list).
The CGNodeMap abstraction did not significantly change, it was just moved.
Added 'dumpCG' for dumping .dot files making it possible to remote debug.
Added '-escapes-enable-graphwriter' option to dump .dot files, since
they are so much more useful than the textual dump of the connection
graph, which lacks node identity!