-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[SSADestroyHoisting] Handle loops. #58791
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
611c325
to
0097628
Compare
0097628
to
cf8a779
Compare
cf8a779
to
769209b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice. I was able to read the code top-to-bottom, proving correctness at each step. The next stage in evolution will be to somehow factor all this to support forward reachability.
I would have moved the implementation out of the class definition, but that's just my style. I like to quickly see the class members and public API in one place.
Somewhere, we need to define what it means for an instruction to be both a gen and a kill. Presumably, the gen takes priority for backward traversal, but I don't think that's the current implementation.
bool setEndState(SILBasicBlock *block, State state) {
You could comment that this returns true if the state changes. It's description and code is complicated enough that it's not 100% obvious.
> /// NOTE: Initialization occurs when stage == Stage::Initializing. During
> /// that stage, it is legal to set an end state to ::Unknown, despite
> /// the fact that it was (implicitly, by not being recorded in either
> /// set of blocks) previously in the ::Reachable state.
Rather than confusing people with all this explanation and adding extra assertion logic, you could just not call this during initialization. This API actually makes initialization more complex as opposed to just directly calling result.unknownEndBlocks.insert(block);
> auto previouslyUnknown = result.unknownEndBlocks.contains(block);
> result.unknownEndBlocks.erase(block);
I don't know why BasicBlockSet::insert
uses testAndSet
but BasicBlockSet::erase
does not. @eeckstein I suggest that BasicBlock::erase
return a bool
for readability and consistency with erase
in other familier data types. A separate BasicBlockSet::reset
can be used for maximum efficiency.
> #ifndef NDEBUG
> auto previouslyUnreachable = result.unreachableEndBlocks.contains(block);
> result.unreachableEndBlocks.erase(block);
> // Transitioning up the lattice from unreachable to reachable is not
> // allowed. If it were, reaching a fixed point would not be guaranteed.
> assert(!previouslyUnreachable);
> return previouslyUnknown || previouslyUnreachable;
> #else
> return previouslyUnknown;
> #endif
Please replace both cases of this pattern with:
assert(!result.unreachableEndBlocks.contains(block));
return previouslyUnknown;
> Optional<Effect> findLocalEffect(SILInstruction *from,
> Optional<Optional<Effect>> initialEffect,
> Optional<Effect> &firstEffect) {
Double Optionals should be banned. Full stop. But...
- There's no reason to have an optional enum in the first place. "Optionality" should be part of the enum state: None, Kill, Gen.
Optionals should only be used to indicate that an API failed to compute the answer. For dataflow 'NoEffect is a valid effect that should not map to a missing value!
- The
initialEffect
is simpler to handle on the caller side. findLocalEffect should simply find the effect of the region of interest. The caller should then apply that effect to the initial value if it has one (that can be a wrapper of course).
You don't even really need firstEffect
. A findPrevKill(inst)
check is really what you want. Except for addKill()
, where you only need a findPrevGen()
. Those implementations would be trivial. It's better to duplicate a couple trivial lines of code than to add parameters and conditionalize implementation.
Instead of...
> void initializeFromGen(SILInstruction *gen) {
> //...
> auto summaryEffect = findLocalEffect(gen, {Effect::Gen}, firstEffect);
> if (isKill(firstEffect)) {
> localGens.push_back(gen);
> if (isKill(summaryEffect))
> result.killBlocks.insert(block);
you could have...
// previous gens get prioritized...
bool recordKill(SILBasicBlock *block) {
if (!result.genBlocks.contains(block)) {
// If the block wasn't previously a gen block, we immediately know that
// adding a kill to the block makes it a kill block.
result.killBlocks.insert(block);
return true;
}
return false;
}
so initializeFromGen is simplified as...
void initializeFromGen(SILInstruction *gen) {
//...
if (findPrevKill(gen)) {
localGens.push_back(gen);
recordKill(block);
}
And addKill gets simplified to...
void addKill(SILInstruction *instruction) {
auto *block = instruction->getParent();
if (!recordKill(block) && !findPrevGen(instruction))
setEffect(Effect::Kill, block);
Then instead of findLocalEffects(SILBasicBlock *)
you could have:
Effect initLocalEffect(SILBasicBlock *block) {
if (genBlocks.contains(block))
return Effects:Gen;
if (findPrevKill(block)) {
result.killBlocks.insert(block);
return Effects::Kill;
}
return Effects:NoEffect;
> void propagateIntoPredecessors(SILBasicBlock *successor) {
> ...
> for (auto *predecessor : successor->getPredecessorBlocks()) {
> if (!result.discoveredBlocks.contains(predecessor))
> continue;
It's not clear to me how global data flow can propagate beyond discovered blocks. Should that be an assert?
> bool findBarrier(SILInstruction *from, SILBasicBlock *block,
>...
> if (*effect == Effect::Gen) {
> assert(false && "found gen (before kill) in reachable block");
Why can't a gen be reachable? There's no joint-post-dominating gen guarantee.
> void initialize() {
> ...
> // This block is a kill, so it is unreachable at the begin. Propagate
> // that unreachable-at-begin-ness to unreachable-at-begin-ness of its
"to unreachable-at-end-ness"
> for (auto iterator = result.discoveredBlocks.begin();
> iterator != result.discoveredBlocks.end(); ++iterator) {
Use an index to avoid iterator invalidation.
Why can't a gen be reachable? There's no joint-post-dominating gen guarantee.
> void initialize() {
> ...
> // This block is a kill, so it is unreachable at the begin. Propagate
> // that unreachable-at-begin-ness to unreachable-at-begin-ness of its
"to unreachable-at-end-ness"
> for (auto iterator = result.discoveredBlocks.begin();
> iterator != result.discoveredBlocks.end(); ++iterator) {
> void findBarriers(SmallSetVector<SILInstruction *, 4> &barrierInstructions,
> SmallSetVector<SILBasicBlock *, 4> &barrierPhis,
> SmallSetVector<SILBasicBlock *, 4> &barrierBlocks) {
I'm not a fan of APIs that take multiple by-ref data structures. I would package that up as struct Barriers
.
We need to define "barrier block" somewhere. It's non-obviuos.
1af71be
to
281bfc5
Compare
@atrick, thank you again for your review! There's now a new version of Reachability.h to which most of your suggestions have been applied.
Done. Implementations along with longer comments are now at the end of Reachability, grouped according to usage. Important types for clients and relevant member functions are listed first, followed by implementation details.
Done.
The As discussed, there is a separate initializeBeginStateToUnknown. Both of these are now member functions on IterativeBlockReachability::Result. The IterativeBackwardReachability::Result class is now solely responsible for dealing with tracking blocks' effects and states. Reachability itself just gets and sets states and effects for blocks, without concern for how they happen to be stored.
Done.
Done. IterativeBackwardReachability::Effect now has three cases:
The variants of findLocalEffect have been replaced with:
No--at least notot without other changes. The point of this continue is to avoid propagating beyond discovered blocks. Blocks can be added to the worklist even if their predecessors aren't reachable. Blocks are added to the worklist whenever they're found to be unreachable-at-begin; there is no check that all of its predecessors are reachable. A block is added to the worklist whenever its unreachable-at-begin. But its predecessors aren't marked as discovered if the block is a kill or if it's the def block.
Fixed. Removed the assert.
Fixed.
Is this a problem for BasicBlockSetVector's iterator aka StackList's iterator? The iterator looks to be a wrapper around an index and the slab being indexed into. Can that be invalidated by growing the StackList? If it is, BasicBlockSetVector doesn't seem to support accessing elements by index.
The findBarriers member function now takes a visitor whose three member functions are called to visit barrier instructions, blocks, and phis.
Done. In the doc for findBarriers. |
@swift-ci please test |
@swift-ci please test source compatibility |
@nate-chandler Awesome! Thank you.
Just a thought. At the use points it isn't very clear what the worklist is for. Since it's global state across a lot of implementation maybe its name should be descriptive. e.g. dataflowWorkList.
Sorry, I did not mean to imply that APIs should be defined out-of-line when all they do is add abstraction. I was just referring to the algorithm itself because, with large methods, I start loosing track of how the surrounding definitions fit together. Instead of
In the last round, I was pointing out that you only need to do the following check during
So you don't need In fact, you don't really need
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On [SSADestroyHoisting] Adopt new utilities.
The pass is overall much nicer even with more powerful dataflow. It's hard to follow the steps of the algorithm with all the boilerplate. But I think that's mostly solved by moving the small method bodies in-line with the class definition--the ones that just provide abstraction or implement template interfaces, but don't have any nontrivial switches or loops.
281bfc5
to
3bf4de0
Compare
@atrick Thank you!
Changed:
Moved the implementation of simple methods back to the declarations.
Fixed--now we won't bother walking all the way to the begin of a block if we find a gen first.
Removed and moved body into summarizeLocalEffect.
Gotcha, that makes sense. So long as gens() is complete. I kept summarizeLocalEffect.
Removed no-longer-needed static method DeinitBarriers::classifyInstruction. Moved simple implementations to declarations. Made DestroyReachability handle the callbacks from VisitBarrierAccessScopes in addition to IterativeBackwardReachability. |
fb2813b
to
9d666ae
Compare
@swift-ci please test |
The utility provides a standard way for backwards traversing a region of a function constrained by a predicate (isInRegion) starting from some root blocks. It exposes a visit function for post-order and iterable collections for post-order and reverse post-order. Adding corresponding pre-order functionality will be straightforward.
The new optimistic, iterative backward reachability is optimized to do as little work as possible. Only blocks not all of whose successors are kills participate in the worklist at all. The blocks within that discovered set are visited via a worklist which tracks blocks which have been found to be unreachable at begin and whose unreachable-at-begin-ness must be propagated into unreachable-at-end-ness of its predecessors. rdar://92545900
The new utility finds access scopes which are barriers by finding access scopes which themselves contain barriers. This is necessary to (1) allow hoisting through access scopes when possible (i.e. not simply treating all end_access instructions as barriers) and (2) not hoist into access scopes that contain barriers and in so doing introduce exclusivity violations.
Instead of doing one or two non-iterative BackwardReachability runs, do a single run of IterativeBackwardReachability. During that, pause after discovery/local dataflow and use VisitBarrierAccessScopes to determine which end_access instructions in the discovered region are barriers. Add those instructions as kills to the dataflow. Finally run the global dataflow. Enables SSADestroyHoisting to hoist destroys over loops. Addresses a correctness issue where access scopes which were open at barrier blocks were not promoted to barriers, resulting in destroy_addrs getting hoisted into unrelated access scopes.
9d666ae
to
24979d4
Compare
@swift-ci please test macOS platform |
@swift-ci please test linux platform |
@swift-ci please test windows platform |
@swift-ci please test source compatibility |
@nate-chandler IterativeBackwardReachability looks great. @eeckstein the latest version is simplified quite a bit if you want to take a look. |
Previously, SSADestroyHoisting used two pessimistic BackwardReachability dataflows. Here, it is made to use IterativeBackwardReachability including a call to the optionally-used method solvePessimistic to do a pessimistic dataflow (to find barrier access scopes) and an optimistic dataflow (to find all barriers).