Skip to content

[SSADestroyHoisting] Handle loops. #58791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
May 24, 2022

Conversation

nate-chandler
Copy link
Contributor

Previously, SSADestroyHoisting used two pessimistic BackwardReachability dataflows. Here, it is made to use IterativeBackwardReachability including a call to the optionally-used method solvePessimistic to do a pessimistic dataflow (to find barrier access scopes) and an optimistic dataflow (to find all barriers).

Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice. I was able to read the code top-to-bottom, proving correctness at each step. The next stage in evolution will be to somehow factor all this to support forward reachability.

I would have moved the implementation out of the class definition, but that's just my style. I like to quickly see the class members and public API in one place.

Somewhere, we need to define what it means for an instruction to be both a gen and a kill. Presumably, the gen takes priority for backward traversal, but I don't think that's the current implementation.

bool setEndState(SILBasicBlock *block, State state) {

You could comment that this returns true if the state changes. It's description and code is complicated enough that it's not 100% obvious.

>  /// NOTE: Initialization occurs when stage == Stage::Initializing.  During
>  ///       that stage, it is legal to set an end state to ::Unknown, despite
>  ///       the fact that it was (implicitly, by not being recorded in either
>  ///       set of blocks) previously in the ::Reachable state.

Rather than confusing people with all this explanation and adding extra assertion logic, you could just not call this during initialization. This API actually makes initialization more complex as opposed to just directly calling result.unknownEndBlocks.insert(block);

>      auto previouslyUnknown = result.unknownEndBlocks.contains(block);
>      result.unknownEndBlocks.erase(block);

I don't know why BasicBlockSet::insert uses testAndSet but BasicBlockSet::erase does not. @eeckstein I suggest that BasicBlock::erase return a bool for readability and consistency with erase in other familier data types. A separate BasicBlockSet::reset can be used for maximum efficiency.

> #ifndef NDEBUG
>      auto previouslyUnreachable = result.unreachableEndBlocks.contains(block);
>      result.unreachableEndBlocks.erase(block);
>      // Transitioning up the lattice from unreachable to reachable is not
>      // allowed.  If it were, reaching a fixed point would not be guaranteed.
>      assert(!previouslyUnreachable);
>      return previouslyUnknown || previouslyUnreachable;
> #else
>      return previouslyUnknown;
> #endif

Please replace both cases of this pattern with:

assert(!result.unreachableEndBlocks.contains(block));
return previouslyUnknown;
>  Optional<Effect> findLocalEffect(SILInstruction *from,
>                                   Optional<Optional<Effect>> initialEffect,
>                                   Optional<Effect> &firstEffect) {

Double Optionals should be banned. Full stop. But...

  1. There's no reason to have an optional enum in the first place. "Optionality" should be part of the enum state: None, Kill, Gen.

Optionals should only be used to indicate that an API failed to compute the answer. For dataflow 'NoEffect is a valid effect that should not map to a missing value!

  1. The initialEffect is simpler to handle on the caller side. findLocalEffect should simply find the effect of the region of interest. The caller should then apply that effect to the initial value if it has one (that can be a wrapper of course).

You don't even really need firstEffect. A findPrevKill(inst) check is really what you want. Except for addKill(), where you only need a findPrevGen(). Those implementations would be trivial. It's better to duplicate a couple trivial lines of code than to add parameters and conditionalize implementation.

Instead of...

>  void initializeFromGen(SILInstruction *gen) {
>    //...
>    auto summaryEffect = findLocalEffect(gen, {Effect::Gen}, firstEffect);
>    if (isKill(firstEffect)) {
>      localGens.push_back(gen);
>      if (isKill(summaryEffect))
>        result.killBlocks.insert(block);

you could have...

  // previous gens get prioritized...
  bool recordKill(SILBasicBlock *block) {
    if (!result.genBlocks.contains(block)) {
      // If the block wasn't previously a gen block, we immediately know that
      // adding a kill to the block makes it a kill block.
      result.killBlocks.insert(block);
      return true;
    }
    return false;
  }

so initializeFromGen is simplified as...

  void initializeFromGen(SILInstruction *gen) {
    //...
    if (findPrevKill(gen)) {
      localGens.push_back(gen);
      recordKill(block);
    }

And addKill gets simplified to...

  void addKill(SILInstruction *instruction) {
    auto *block = instruction->getParent();
    if (!recordKill(block) && !findPrevGen(instruction))
      setEffect(Effect::Kill, block);

Then instead of findLocalEffects(SILBasicBlock *) you could have:

 Effect initLocalEffect(SILBasicBlock *block) {
   if (genBlocks.contains(block))
     return Effects:Gen;
   if (findPrevKill(block)) {
     result.killBlocks.insert(block);
     return Effects::Kill;
   }
   return Effects:NoEffect;
>  void propagateIntoPredecessors(SILBasicBlock *successor) {
>    ...
>    for (auto *predecessor : successor->getPredecessorBlocks()) {
>      if (!result.discoveredBlocks.contains(predecessor))
>        continue;

It's not clear to me how global data flow can propagate beyond discovered blocks. Should that be an assert?

>  bool findBarrier(SILInstruction *from, SILBasicBlock *block,
>...
>      if (*effect == Effect::Gen) {
>        assert(false && "found gen (before kill) in reachable block");

Why can't a gen be reachable? There's no joint-post-dominating gen guarantee.

> void initialize() {
> ...
>        // This block is a kill, so it is unreachable at the begin.  Propagate
>        // that unreachable-at-begin-ness to unreachable-at-begin-ness of its

"to unreachable-at-end-ness"

>    for (auto iterator = result.discoveredBlocks.begin();
>         iterator != result.discoveredBlocks.end(); ++iterator) {

Use an index to avoid iterator invalidation.

Why can't a gen be reachable? There's no joint-post-dominating gen guarantee.

> void initialize() {
> ...
>        // This block is a kill, so it is unreachable at the begin.  Propagate
>        // that unreachable-at-begin-ness to unreachable-at-begin-ness of its

"to unreachable-at-end-ness"

>    for (auto iterator = result.discoveredBlocks.begin();
>         iterator != result.discoveredBlocks.end(); ++iterator) {

>  void findBarriers(SmallSetVector<SILInstruction *, 4> &barrierInstructions,
>                    SmallSetVector<SILBasicBlock *, 4> &barrierPhis,
>                    SmallSetVector<SILBasicBlock *, 4> &barrierBlocks) {

I'm not a fan of APIs that take multiple by-ref data structures. I would package that up as struct Barriers.

We need to define "barrier block" somewhere. It's non-obviuos.

@nate-chandler nate-chandler force-pushed the rdar92545900 branch 3 times, most recently from 1af71be to 281bfc5 Compare May 19, 2022 04:02
@nate-chandler
Copy link
Contributor Author

@atrick, thank you again for your review!

There's now a new version of Reachability.h to which most of your suggestions have been applied.


moved the implementation out of the class definition

Done. Implementations along with longer comments are now at the end of Reachability, grouped according to usage. Important types for clients and relevant member functions are listed first, followed by implementation details.

setEndState
You could comment that this returns true if the state changes.

Done.

This API actually makes initialization more complex as opposed to just directly calling result.unknownEndBlocks.insert(block);

The setEndStateForBlock function is now only used for state transitions, not initialization.

As discussed, there is a separate initializeBeginStateToUnknown.

Both of these are now member functions on IterativeBlockReachability::Result.

The IterativeBackwardReachability::Result class is now solely responsible for dealing with tracking blocks' effects and states. Reachability itself just gets and sets states and effects for blocks, without concern for how they happen to be stored.

replace both cases of this pattern

Done.

"Optionality" should be part of the enum state

Done. IterativeBackwardReachability::Effect now has three cases: NoEffect, Kill, Gen.

findLocalEffect should simply find the effect of the region of interest

The variants of findLocalEffect have been replaced with: summarizeLocalEffect(SILBasicBlock *), summarizeLocalEffectBefore(SILInstruction *), and findFirstLocalEffectBefore(SILInstruction *, Effect target). The bodies of the second two are similar. The version that takes a block composes the effect of the back of the block with the effect before the block.

It's not clear to me how global data flow can propagate beyond discovered blocks. Should that be an assert?

No--at least notot without other changes. The point of this continue is to avoid propagating beyond discovered blocks. Blocks can be added to the worklist even if their predecessors aren't reachable. Blocks are added to the worklist whenever they're found to be unreachable-at-begin; there is no check that all of its predecessors are reachable.

A block is added to the worklist whenever its unreachable-at-begin. But its predecessors aren't marked as discovered if the block is a kill or if it's the def block.

Why can't a gen be reachable? There's no joint-post-dominating gen guarantee.

Fixed. Removed the assert.

"to unreachable-at-end-ness"

Fixed.

Use an index to avoid iterator invalidation.

Is this a problem for BasicBlockSetVector's iterator aka StackList's iterator? The iterator looks to be a wrapper around an index and the slab being indexed into. Can that be invalidated by growing the StackList? If it is, BasicBlockSetVector doesn't seem to support accessing elements by index.

package that up as struct Barriers

The findBarriers member function now takes a visitor whose three member functions are called to visit barrier instructions, blocks, and phis.

We need to define "barrier block" somewhere. It's non-obviuos.

Done. In the doc for findBarriers.

@nate-chandler
Copy link
Contributor Author

@swift-ci please test

@nate-chandler nate-chandler requested a review from atrick May 19, 2022 04:03
@nate-chandler
Copy link
Contributor Author

@swift-ci please test source compatibility

@atrick
Copy link
Contributor

atrick commented May 19, 2022

@nate-chandler Awesome! Thank you.

>  BasicBlockWorklist worklist;

Just a thought. At the use points it isn't very clear what the worklist is for. Since it's global state across a lot of implementation maybe its name should be descriptive. e.g. dataflowWorkList.

> template <typename Effects>
> void IterativeBackwardReachability<
>     Effects>::Result::initializeEndStateToUnknown(SILBasicBlock *block) {
>   unknownEndBlocks.insert(block);
> }

Sorry, I did not mean to imply that APIs should be defined out-of-line when all they do is add abstraction. I was just referring to the algorithm itself because, with large methods, I start loosing track of how the surrounding definitions fit together.

Instead of result.template setEffectForBlock</*SkipOverwrite=*/true>(block, effect), I would have done result.initEffectForBlock(block, effect) result.updateEffectForBlock(block, effect) wrappers.

>  // If the block was previously a gen block, the new kill may or may not make
>  // it a kill block.  Summarize the block again.
>  auto effect = Effect::Kill().compose(summarizeLocalEffectBefore(instruction));

In the last round, I was pointing out that you only need to do the following check during addKill:

findLocalEffectBefore(gen, Effect::Gen())

So you don't need summarizeLocalEffectBefore at all. You could just move its implementation into summarizeLocalEffect(SILBasicBlock *).

In fact, you don't really need summarizeLocalEffect(block). I think it's a nice API and makes things clear, so no need to remove it. But for the sake of discussion, the initialization loop over discovered blocks only needs to find kill blocks. So it could just skip 'gen' blocks and use

if (effects.effectForInstruction(end) == Effect::Kill()
    || findLocalEffectBefore(end, Effect::Kill()))

Copy link
Contributor

@atrick atrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On [SSADestroyHoisting] Adopt new utilities.

The pass is overall much nicer even with more powerful dataflow. It's hard to follow the steps of the algorithm with all the boilerplate. But I think that's mostly solved by moving the small method bodies in-line with the class definition--the ones that just provide abstraction or implement template interfaces, but don't have any nontrivial switches or loops.

@nate-chandler
Copy link
Contributor Author

@atrick Thank you!


maybe its name should be descriptive. e.g. dataflowWorkList

Changed: worklist -> dataflowWorklist.

initializeEndStateToUnknown

Moved the implementation of simple methods back to the declarations.

only need to do the following check during addKill

Fixed--now we won't bother walking all the way to the begin of a block if we find a gen first.

summarizeLocalEffectBefore

Removed and moved body into summarizeLocalEffect.

But for the sake of discussion, the initialization loop over discovered blocks only needs to find kill blocks.

Gotcha, that makes sense. So long as gens() is complete. I kept summarizeLocalEffect.

It's hard to follow the steps of the algorithm with all the boilerplate.

Removed no-longer-needed static method DeinitBarriers::classifyInstruction. Moved simple implementations to declarations. Made DestroyReachability handle the callbacks from VisitBarrierAccessScopes in addition to IterativeBackwardReachability.

@nate-chandler nate-chandler force-pushed the rdar92545900 branch 4 times, most recently from fb2813b to 9d666ae Compare May 21, 2022 17:14
@nate-chandler
Copy link
Contributor Author

@swift-ci please test

The utility provides a standard way for backwards traversing a region of
a function constrained by a predicate (isInRegion) starting from some
root blocks.  It exposes a visit function for post-order and iterable
collections for post-order and reverse post-order.  Adding corresponding
pre-order functionality will be straightforward.
The new optimistic, iterative backward reachability is optimized to do
as little work as possible.  Only blocks not all of whose successors are
kills participate in the worklist at all.  The blocks within that
discovered set are visited via a worklist which tracks blocks which have
been found to be unreachable at begin and whose
unreachable-at-begin-ness must be propagated into
unreachable-at-end-ness of its predecessors.

rdar://92545900
The new utility finds access scopes which are barriers by finding access
scopes which themselves contain barriers.  This is necessary to (1)
allow hoisting through access scopes when possible (i.e. not simply
treating all end_access instructions as barriers) and (2) not hoist into
access scopes that contain barriers and in so doing introduce
exclusivity violations.
Instead of doing one or two non-iterative BackwardReachability runs,
do a single run of IterativeBackwardReachability.  During that, pause
after discovery/local dataflow and use VisitBarrierAccessScopes to
determine which end_access instructions in the discovered region are
barriers.  Add those instructions as kills to the dataflow.  Finally run
the global dataflow.

Enables SSADestroyHoisting to hoist destroys over loops.

Addresses a correctness issue where access scopes which were open at
barrier blocks were not promoted to barriers, resulting in destroy_addrs
getting hoisted into unrelated access scopes.
@nate-chandler
Copy link
Contributor Author

@swift-ci please test macOS platform

@nate-chandler
Copy link
Contributor Author

@swift-ci please test linux platform

@nate-chandler
Copy link
Contributor Author

@swift-ci please test windows platform

@nate-chandler
Copy link
Contributor Author

@swift-ci please test source compatibility

@atrick
Copy link
Contributor

atrick commented May 25, 2022

@nate-chandler IterativeBackwardReachability looks great. @eeckstein the latest version is simplified quite a bit if you want to take a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants