Skip to content

[Coroutines][LazyCallGraph] addSplitRefRecursiveFunctions allows spurious ref edges between new functions. #116285

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

TylerNowicki
Copy link
Collaborator

@TylerNowicki TylerNowicki commented Nov 14, 2024

The addSplitRefRecursiveFunctions LazyCallGraph method should not require a reference between every new function. Spurious ref edges between the new functions are allowed and the new function are considered to be a RefSCC. This change clarifies that this is the case in the method's description and its DEBUG mode verifier to allow spurious ref edges.

@TylerNowicki TylerNowicki added coroutines C++20 coroutines llvm:analysis Includes value tracking, cost tables and constant folding labels Nov 14, 2024
@TylerNowicki TylerNowicki self-assigned this Nov 14, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2024

@llvm/pr-subscribers-coroutines

Author: Tyler Nowicki (TylerNowicki)

Changes

In this debug code the check assumes the lookup find an object. If it does not it tries to dereference the nullptr. This patch adds a check, however, notice it does not require the object to exist.

This method (addSplitRefRecursiveFunctions) was added for coroutines with continuations. The split coroutine transform generates the continuations by cloning the entire function then changing the entry block and a few other things. This results in a lot of dead code that is kept around only to satisfy a few assumptions then is removed.

With the dead code the original function and its continuations appear to be SCC (all functions reference all other functions). However, when the dead code is removed they are not SCC. Consider a simple coroutine { begin ... suspend ... suspend ... end }. After dead code is eliminated the ramp only references resume0, resume0 only references resume1, etc... I found the original PR [1] for addSplitRefRecursiveFunctions, but it seems to assume that the new resume functions will be a SCC.

In our use case the dead code can be a particular problem due to its size. For compile-time/size it is necessary to split the coroutine without generating dead code (e.g. CloneAndPrune). However, this also removes the dead references (on a phi's edges) to the other resume functions. The resulting resume functions are not an SCC, but this contradicts what has been written about addSplitRefRecursiveFunctions "To keep things simple, this only supports the case where all new edges are ref edges, and every new function references every other new function." [1]. The comments on are a little more relaxed saying "All new functions must reference (not call) each other." Does this mean that each resume function doesn't need to reference every other resume function?

From my testing it seems that this condition can be relaxed, and it is only this debug code that checks that "every other new function" is referenced!

Can we safely relax this such that the set of new (resume) functions is not necessarily an SCC?

[1] https://reviews.llvm.org/D93828#change-Xmlqywc9evLH


Full diff: https://github.com/llvm/llvm-project/pull/116285.diff

1 Files Affected:

  • (modified) llvm/lib/Analysis/LazyCallGraph.cpp (+3-2)
diff --git a/llvm/lib/Analysis/LazyCallGraph.cpp b/llvm/lib/Analysis/LazyCallGraph.cpp
index 5aa36bfc36d468..724bd0ff416c8a 100644
--- a/llvm/lib/Analysis/LazyCallGraph.cpp
+++ b/llvm/lib/Analysis/LazyCallGraph.cpp
@@ -1775,8 +1775,9 @@ void LazyCallGraph::addSplitRefRecursiveFunctions(
       if (F1 == F2)
         continue;
       Node &N2 = get(*F2);
-      assert(!N1->lookup(N2)->isCall() &&
-             "Edges between new functions must be ref edges");
+      assert(!N1->lookup(N2) ||
+             (!N1->lookup(N2)->isCall() &&
+              "Edges between new functions must be ref edges"));
     }
   }
 #endif

@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Tyler Nowicki (TylerNowicki)

Changes

In this debug code the check assumes the lookup find an object. If it does not it tries to dereference the nullptr. This patch adds a check, however, notice it does not require the object to exist.

This method (addSplitRefRecursiveFunctions) was added for coroutines with continuations. The split coroutine transform generates the continuations by cloning the entire function then changing the entry block and a few other things. This results in a lot of dead code that is kept around only to satisfy a few assumptions then is removed.

With the dead code the original function and its continuations appear to be SCC (all functions reference all other functions). However, when the dead code is removed they are not SCC. Consider a simple coroutine { begin ... suspend ... suspend ... end }. After dead code is eliminated the ramp only references resume0, resume0 only references resume1, etc... I found the original PR [1] for addSplitRefRecursiveFunctions, but it seems to assume that the new resume functions will be a SCC.

In our use case the dead code can be a particular problem due to its size. For compile-time/size it is necessary to split the coroutine without generating dead code (e.g. CloneAndPrune). However, this also removes the dead references (on a phi's edges) to the other resume functions. The resulting resume functions are not an SCC, but this contradicts what has been written about addSplitRefRecursiveFunctions "To keep things simple, this only supports the case where all new edges are ref edges, and every new function references every other new function." [1]. The comments on are a little more relaxed saying "All new functions must reference (not call) each other." Does this mean that each resume function doesn't need to reference every other resume function?

From my testing it seems that this condition can be relaxed, and it is only this debug code that checks that "every other new function" is referenced!

Can we safely relax this such that the set of new (resume) functions is not necessarily an SCC?

[1] https://reviews.llvm.org/D93828#change-Xmlqywc9evLH


Full diff: https://github.com/llvm/llvm-project/pull/116285.diff

1 Files Affected:

  • (modified) llvm/lib/Analysis/LazyCallGraph.cpp (+3-2)
diff --git a/llvm/lib/Analysis/LazyCallGraph.cpp b/llvm/lib/Analysis/LazyCallGraph.cpp
index 5aa36bfc36d468..724bd0ff416c8a 100644
--- a/llvm/lib/Analysis/LazyCallGraph.cpp
+++ b/llvm/lib/Analysis/LazyCallGraph.cpp
@@ -1775,8 +1775,9 @@ void LazyCallGraph::addSplitRefRecursiveFunctions(
       if (F1 == F2)
         continue;
       Node &N2 = get(*F2);
-      assert(!N1->lookup(N2)->isCall() &&
-             "Edges between new functions must be ref edges");
+      assert(!N1->lookup(N2) ||
+             (!N1->lookup(N2)->isCall() &&
+              "Edges between new functions must be ref edges"));
     }
   }
 #endif

TylerNowicki pushed a commit to TylerNowicki/llvm-project that referenced this pull request Nov 14, 2024
I am confused by what replaceSwiftErrorOps is supposed to do and it
doesn't seem to be well covered by lit-tests. At least in tree.

The function appears to primarily operate on the original function,
because it processes the SwiftErrorOps in Shape, collected from the
unsplit function. However, it is called during cloning process of each
resume function and from reading the code it seems to do something
strange.

After cloning the first resume funnction it may add Load and Store
instructions to the original function. These would then appear in any
subsequent resume functions that are cloned from the original, but not
the one being processed. Instead an alloca will be created in the first
resume function. After the first call to replaceSwiftErrorOps the
SwiftErrorOps list is cleared so no other resume functions will get the
alloca. Following this replaceSwiftErrorOps is called again after
splitting but that would do nothing (right?).

Removing the call within the Cloner::create() doesn't break any
lit-tests. Can this be safely removed?

I am looking at this because I am working on splitting. As explained in
llvm#116285 I want to CloneAndPrune
to create resume functions that only include the code they need and not
the entire original function. However, this call causes
coro-swifterror.ll to fail by:

swifterror argument should come from an alloca or parameter
ptr poison
  tail call void @maybeThrow(ptr swifterror poison)

The swifterror argument is not correctly used in a few places in the IR
and some how removing the replaceSwiftErrorOps call in Cloner::create()
resolves the problem.
@yuxuanchen1997
Copy link
Member

yuxuanchen1997 commented Nov 14, 2024

I am a bit uncomfortable with general CGSCC mutation API changes. Reading the old diff, you referenced it looks like this design is deliberate. Don't we have the .resumers array GV for this? Would it work by fabricating another reference to it? You mentioned continuation, a different ABI.

@TylerNowicki
Copy link
Collaborator Author

I am a bit uncomfortable with general CGSCC mutation API changes. Reading the old diff, you referenced it looks like this design is deliberate. Don't we have the .resumers array GV for this? Would it work by fabricating another reference to it? You mentioned continuation, a different ABI.

As far as I can see this is only used by CoroSplit, the method was introduced to add a group of continuations.

There are certainly ways to work around this, but any solution is a kind of hack. For example it should be possible to add dead code that makes each resume reference all other the others (as is what happens right now).

I was hoping with this PR to gain a better understanding of how CGSCC and coroutines are working together. Currently that is not well documented although I may have missed it. And perhaps find a way to avoid using hacks.

@ChuanqiXu9 ChuanqiXu9 requested a review from rjmccall November 15, 2024 02:08
@ChuanqiXu9
Copy link
Member

Yeah, this is not the switch ABI. Let's hear the opinion from @rjmccall

@yuxuanchen1997
Copy link
Member

Yeah, this is not the switch ABI. Let's hear the opinion from @rjmccall

Switch ABI suffers from the same problem. I think that's part of the reason we have f.resumers. Anyway, it will be good to understand why the constraints are here and what are the best possible contracts with LazyCallGraph.

@rjmccall
Copy link
Contributor

IIRC, the coro passes are trying to be cooperative by updating the call graph after splitting rather than just invalidating the analysis in its entirety. I don't think it's lowering-specific in any way. I don't have any strong opinion about how this is done, so if the code is wrong, or if it's causing more trouble than it's worth, feel free to rip it out.

@aeubanks
Copy link
Contributor

The (Ref)SCCs must be accurate in regards to the actual IR. addSplitRefRecursiveFunctions() I believe only has the restriction that the new split functions are each in their own SCC but are in the same RefSCC, meaning they have (potentially indirect) circular references between all the split functions that are all ref edges, but they don't necessarily have to all reference each other directly.

If we have multiple split functions and one of them doesn't reference any of the others, there's addSplitFunction() for that case, and any remaining ref-recursive functions can be added with addSplitRefRecursiveFunctions()

@rjmccall
Copy link
Contributor

Yeah, it's probably not trivial to get that information right during splitting because the cloned functions can end up containing arbitrary subsets of the original function.

@aeubanks
Copy link
Contributor

we already have to populate the new nodes per new function, maybe we can change the LazyCallGraph methods to be a bit smarter by detecting which of the split functions actually reference each other?

@TylerNowicki
Copy link
Collaborator Author

TylerNowicki commented Nov 20, 2024

I have had to do some studying on this to understand the discussion. I think I am able to follow it now.

The implementation of addSplitRefRecursiveFunctions, in the normal case, will create a new (Ref)SCC for the continuations. That is 1 (Ref)SCC to contain all the continuations. Then within this an SCC is created for each continuation individually. It does not actually require that any of the continuations directly call each other. Additionally, it doesn't attempt to merge any of these SCCs either. That seems reasonable because although one continuation will transitively reach the next this isn't done by a direct call. Instead, the coroutine caller is the one that invokes each continuation. Each continuation will contain a function ptr to the next and it is that which the coroutine caller invokes. Consequently, the continuations are each in their own SCC!

There is an additional special case that may reuse the (Ref)SCC of the original (ramp) function.

What I describe above does not hold for Switch ABI. In this ABI the resume, destroy and clean-up are individually added to the call graph with addSplitFunction(). This method handles 3 cases: 1) the new function has a direct call to any function in the original SCC, so it is in the same SCC, 2) the new function has a direct call to any function in the original (Ref)SCC, so it is in the same (Ref)SCC but not the same SCC, 3) no direct calls to the original (Ref)SCC/SCC, so it is in its own (Ref)SCC. The second and third cases are what is handled by addSplitRefRecursiveFunctions for the group of continuations.

Based on this and what was said I believe the existing implementation of addSplitRefRecursiveFunctions is perfectly fine. It is only some of the comments and the debug code (already in this MR) that should be updated. I will update the MR to include these other changes. I have also started writing a discussion on the topic how coroutines update the call-graph that I will create a separate PR for soon.

Please let me know if anything I said is not accurate/correct.

@TylerNowicki TylerNowicki force-pushed the amd/dev/tnowicki/lazy.call.graph.coro branch from 3075d2d to ab709b1 Compare November 25, 2024 18:28
@TylerNowicki
Copy link
Collaborator Author

The patch is ready for re-review.

I decided that it makes sense to insert a ref from each new function to all other new functions. This will ensure all new functions are within the same Ref(SCC). A ref from the original function to all new functions was already being added so this is an extension of that. I also updated the comments to reflect this and added an additional check in the assertion, although at this point the assertion may not be necessary because it is adding the refs if they were not already there. Should the debug checks be removed now?

This change works with the current way coroutines are (fully) cloned, because full cloning creates a PHI each in new function that references all other new functions. So, ensuring a ref exists between all new functions in addSplitRefRecursiveFunctions does not change what is already happening. If the cloning is modified to only clone BBs that are in each continuation, then PHI will not be include all new functions. So, ensuring a ref exists between all new functions ensures the Ref(SCC) requirement is met as well.

I will stage my changes for pruning while cloning in future PRs.

@TylerNowicki
Copy link
Collaborator Author

TylerNowicki commented Dec 5, 2024

Ping, any comments on the changes?

@rjmccall
Copy link
Contributor

rjmccall commented Dec 5, 2024

That sounds roughly right. There's no need to add a reference back to the initial funclet in any lowering, and the switch lowering only strictly needs to ref the two new funclets from the initial funclet, but otherwise, adding a ref from every funclet to every non-initial funclet is definitely the conservatively correct way to handle all of the lowerings that use non-unified continuations (which is everything but the switch lowering).

@ChuanqiXu9 ChuanqiXu9 requested a review from nikic December 6, 2024 02:28
@ChuanqiXu9
Copy link
Member

Given the changes are not in Coroutines now, I add @nikic as the global maintainers to make a decide (or adding other reviewers)

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, was out on vacation then got sick

should have tests in llvm/unittests/Analysis/LazyCallGraphTest.cpp

Comment on lines 1721 to +1723
Node &NewN = initNode(*NewFunction);

// Make the original function reference each new function
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't initNode above populate the edges out of the node? that should already be adding the edges in the code you're adding below

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this doesn't add edges between nodes if there are no edges between them. Right now when we create a continuation we clone the whole thing, the entire function and fixup the entry point, then it is added to the LazyCallGraph (so there is a phi in the continuation with references to all other new continuations), then do a clean-up that removes all the unused code including the phi.

Obviously cloning the whole function, just for the phi references, then removing all of the unused code is wasted compile-time. I will be making a change that prunes the BBs while creating the continuation. The only problem is now the phi with references to the other continuations is not there. But that phi being there is a bit artificial anyway since we remove it. Instead, this function can add the edges that were not represented in the continuation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LazyCallGraph models the actual IR, so if there aren't any references in the actual IR, there shouldn't be an edge in LazyCallGraph. The LazyCallGraph APIs should be called after all transformations are done. I think I'm not fully understanding your comment, a concrete example would be helpful.

Copy link
Collaborator Author

@TylerNowicki TylerNowicki Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you describe differs from what is happening right now (without my patch). Here are some more details on how it works right now. First for the coroutine function we create a single/unified return block for each suspend point and create a phi in the return BB that takes a reference to a continuation function. We create one empty continuation function for each suspend point. Then for each continuation function we clone the contents of the coroutine function in its entirety and fix-up the entry point. Notice that the return BB and its phi are retained in each continuation function. Then we call addSplitRefRecursiveFunctions and then we do simplify-cfg. The simplify-cfg removes all of the unnecessary blocks and as a result it will remove some or all of the edges in the phi we added to the return BB.

For example, if we have a coroutine like

foo() {
 // part A
 suspend1:
 // part B
 suspend2:
 // part C
 exit:
}

Then we first generate

foo() {
 // part A
 suspend1:
 // part B
 suspend2:
 // part C
 exit:v
 returnBB:
 phi = [continuationB suspend1, continuationC, suspend2]
}

continuationB () {
 // part A
 suspend1:
 // part B  <-- entry point
 suspend2:
 // part C
 exit:
 returnBB:
 phi = [continuationB suspend1, continuationC, suspend2]
}

continuationC () {
 // part A
 suspend1:
 // part B
 suspend2:
 // part C <-- entry point
 exit:
 returnBB:
 phi = [continuationB suspend1, continuationC, suspend2]
}

At this point we call addSplitRefRecursiveFunctions, adding the new continuation functions to the call graph and they will form a (Ref)SCC.

However, then we do simplify-cfg and the resulting functions look like

foo() {
 // part A
 returnBB:
 phi = [continuationB suspend1]
}

continuationB () {
 // part B  <-- entry point
 returnBB
 phi = [continuationC, suspend2]
}

continuationC () {
 // part C <-- entry point
 exit:
}

So, the question is, what is the correct call-graph? Based on previous comments I assumed we actually wanted to keep the current behavior and keep the continuation functions in the same (Ref)SCC. See an earlier comment by John.

adding a ref from every funclet to every non-initial funclet is definitely the conservatively correct way to handle all of the lowerings that use non-unified continuations (which is everything but the switch lowering).

The Async method and the returned-continuation method both use addSplitRefRecursiveFunctions to add their new continuation functions to the call graph. As John said we don't need to worry about the switch method because it uses addSplitFunction to add its funclets to the call graph.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a unit test that resembles the above example.

Copy link
Collaborator Author

@TylerNowicki TylerNowicki Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just have addSplitRefRecursiveFunctions() call updateCGAndAnalysisManagerForCGSCCPass() then? Otherwise, it feels like we are just reinventing the function to identify the SCCs and RefSCCs in addSplitRefRecursiveFunctions(). The alternative is before adding the new functions to the CallGraph I add unreachable blocks to create 'fake' references between the new functions on the incoming edges of an unused phi node to give the new functions the appearance of being a RefSCC. Then after addSplitRefRecursiveFunctions() I delete those and call updateCGAndAnalysisManagerForCGSCCPass().

Both of these options don't sound great. It would be a lot better if the rule was that the callgraph matches the IR following completion of the pass. It seems overly restrictive to require the call-graph to match the IR at arbitrary points in the pass transformation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm confused based on the description and the previous discussion. Are you planning on first creating all the split functions in full (referencing each other), then one by one simplify-cfg each one, or are you planning on simplifying the cfg as you clone, meaning there's never actually a point where we have a fully connected RefSCC? The discussion above seems to imply the former, and the PR description seems to imply the latter.

If it's the former, then something like

coroSplit(Function& F, LazyCallGraph& LCG) {
  std::vector<Function *> SplitFunctions = doSplit(F);
  LCG.addSplitRefRecursiveFunctions(F, SplitFunctions);
  for (Function& NewF: SplitFunctions) {
    runSimplifyCfg(NewF);
    updateCGAndAnalysisManagerForFunctionPass(LCG, NewF);
    // updateCGAndAnalysisManagerForFunctionPass handles all possible transformations done by simplify-cfg
    // updateCGAndAnalysisManagerForCGSCCPass is not necessary since we only did function pass transformations since the last LCG update
  }
}

is simple and should work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My request is the latter option you gave. Sorry for the confusion. I will create an RFC to explain my request more concisely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another clarifying question, does the original function still always reference all newly split functions?

Copy link
Collaborator Author

@TylerNowicki TylerNowicki Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another clarifying question, does the original function still always reference all newly split functions?

Yes, that is the case. I don't see that changing.

@TylerNowicki TylerNowicki force-pushed the amd/dev/tnowicki/lazy.call.graph.coro branch 2 times, most recently from 84a0c80 to 21b2331 Compare December 30, 2024 21:49
@TylerNowicki TylerNowicki force-pushed the amd/dev/tnowicki/lazy.call.graph.coro branch 2 times, most recently from a78512f to bc89212 Compare January 10, 2025 18:18
Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

having thought about this a bit more, I think this use case makes sense and this change is fine. we can use the fact that spurious ref edges are ok (I hope this doesn't come back to bite us)

we should add more comments explaining this case where the new functions may not necessarily reference each other and that's ok due to spurious ref edges being allowed

Comment on lines 1090 to 1092
/// The CG must be updated following the use of this helper, for example with
/// updateCGAndAnalysisManagerForCGSCCPass(), to ensure the RefSCCs and SCCs
/// are correctly identified.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment isn't really true. if all you do is create new functions and add ref edges from the original function to the new functions, you technically don't need to call updateCGAndAnalysisManagerForCGSCCPass() since addSplitRefRecursiveFunctions() will take care of all the call graph updates, and (I believe) there are no analysis manager updates to be done. the point of updateCGAndAnalysisManagerForCGSCCPass() is to do the call graph update for you when there are potential edge additions/removals that can change the call graph and requeue any modified SCCs to be visited again. but if the pass does a limited number of transformations that can be expressed as LazyCallGraph API calls, doesn't do anything that would invalidate analysis manager results, and requeues any potentially modified SCCs (like the CoroSplit code that does that I mentioned above), then updateCGAndAnalysisManagerForCGSCCPass() is unnecessary.

so I'd remove this paragraph

Copy link
Collaborator Author

@TylerNowicki TylerNowicki Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the point of updateCGAndAnalysisManagerForCGSCCPass() is to do the call graph update for you when there are potential edge additions/removals that can change the call graph and requeue any modified SCCs to be visited again

This point is new to me actually. I only previously understood that we wanted the call-graph's (Ref)SCCs to match the IR. But it sounds like what you are saying is that it will recompute the list of SCCs that need to be visited. In our case the original and new functions are requeued by addSplitRefRecursiveFunctions, so no need to call updateCGAndAnalysisManagerForCGSCCPass().

Just to clarify: after we call addSplitRefRecursiveFunctions we immediately call updateCGAndAnalysisManagerForCGSCCPass, followed by simplifying the original function, then a call to updateCGAndAnalysisManagerForFunctionPass. You can see this in CoroSplit.cpp::updateCallGraphAfterCoroutineSplit(...). When we simplify the original function some of the references to the new functions may be removed. I am starting to suspect the call to updateCGAndAnalysisManagerForCGSCCPass is not required, does that match your interpretation as well?

  • In older messages I may have been confused about when cleanup was done. The above is correct as I understand it.
  • (like the CoroSplit code that does that I mentioned above), then updateCGAndAnalysisManagerForCGSCCPass() is unnecessary.

  • Yes, your code also omits the call to updateCGAndAnalysisManagerForCGSCCPass. Interesting!

another clarifying question, does the original function still always reference all newly split functions?

To clarify your question below as it relates to this. At the time when addSplitRefRecursiveFunctions and updateCGAndAnalysisManagerForCGSCCPass are called the original function still always references all the newly split functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case the original and new functions are requeued by addSplitRefRecursiveFunctions, so no need to call updateCGAndAnalysisManagerForCGSCCPass().

addSplitRefRecursiveFunctions() doesn't requeue SCCs to be revisited (since it's just a LazyCallGraph method that has no idea of SCC pass manager scheduling), either updateCGAndAnalysisManagerFor*Pass() does here, or you can do it manually like CoroSplit does here.

Just to clarify: after we call addSplitRefRecursiveFunctions we immediately call updateCGAndAnalysisManagerForCGSCCPass, followed by simplifying the original function, then a call to updateCGAndAnalysisManagerForFunctionPass. You can see this in CoroSplit.cpp::updateCallGraphAfterCoroutineSplit(...). When we simplify the original function some of the references to the new functions may be removed. I am starting to suspect the call to updateCGAndAnalysisManagerForCGSCCPass is not required, does that match your interpretation as well?

yes, the call to updateCGAndAnalysisManagerForCGSCCPass() shouldn't be necessary since you've already updated LazyCallGraph, but you would still need to manually requeue the new SCCs to be visited, which CoroSplit already does as mentioned above.

@TylerNowicki
Copy link
Collaborator Author

having thought about this a bit more, I think this use case makes sense and this change is fine. we can use the fact that spurious ref edges are ok (I hope this doesn't come back to bite us)

we should add more comments explaining this case where the new functions may not necessarily reference each other and that's ok due to spurious ref edges being allowed

Sorry for the delayed response, I am returning to this now. Thank you for your help on this!

I just want to be clear we are on the same page. To recap, I am modifying how new functions are created during the coroutine split pass. I am planning on simplifying the cfg as I clone, meaning there's never actually a point where we have a fully connected RefSCC. To clarify your prior message by "spurious ref edges", you are referring to the missing ref edges between new functions in the RefSCC. And this sounds fine to you. I just want to be clear.

TylerNowicki and others added 4 commits February 12, 2025 17:41
In this debug code the check assumes the lookup find an object. If it
does not it tries to dereference the nullptr. This patch adds a check,
however, notice it does not require the object to exist.

This method (addSplitRefRecursiveFunctions) was added for coroutines
with continuations. The split coroutine transform generates the
continuations by cloning the entire function then changing the entry
block and a few other things. This results in a lot of dead code that is
kept around only to satisfy a few assumptions then is removed.

With the deadcode the original function and its continuations appear to
be SCC. However, when the deadcode is removed they are not SCC. Consider
a simple coroutine { begin ... suspend ... suspend ... end }. After
deadcode is eliminated the ramp references resume0, resume0 references
resume1, etc...

To be efficient it is necessary to split coroutines without generating
deadcode and this removese the unnecessary references (on a phi's edges).
This does not satisfy one of the conditions of addSplitRefRecursiveFunctions:
"All new functions must reference (not call) each other.".

From my testing so far it seems that this condition is not necessary and
it is only this debug code that checks for it.

Can we safely remove the "All new functions must reference (not call) each
other." requirement of addSplitRefRecursiveFunctions?

If not, what alternative do we have so we avoid cloning deadcode? (In
our use case the deadcode can be a particular problem due to its size).
@TylerNowicki TylerNowicki force-pushed the amd/dev/tnowicki/lazy.call.graph.coro branch from bc89212 to 9331a61 Compare February 12, 2025 23:22
@aeubanks
Copy link
Contributor

To clarify your prior message by "spurious ref edges", you are referring to the missing ref edges between new functions in the RefSCC. And this sounds fine to you.

Yup I think we're on the same page

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR lgtm, I'd remove the original PR description section and add some more notes on the spurious ref edge in the description

@aeubanks
Copy link
Contributor

and please add me as a reviewer for the CoroSplit change when you get around to that

@TylerNowicki TylerNowicki changed the title [Coroutines][LazyCallGraph] resumes are not really SCC [Coroutines][LazyCallGraph] addSplitRefRecursiveFunctions allows spurious ref edges between new functions. Feb 14, 2025
@TylerNowicki TylerNowicki merged commit c662103 into llvm:main Feb 14, 2025
8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Feb 14, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/13821

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Analysis/./AnalysisTests/79/171' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/b/1/llvm-clang-x86_64-expensive-checks-debian/build/unittests/Analysis/./AnalysisTests-LLVM-Unit-3491294-79-171.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=171 GTEST_SHARD_INDEX=79 /b/1/llvm-clang-x86_64-expensive-checks-debian/build/unittests/Analysis/./AnalysisTests
--

Note: This is test shard 80 of 171.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from CtxProfAnalysisTest
[ RUN      ] CtxProfAnalysisTest.GetCallsiteIDNegativeTest
[       OK ] CtxProfAnalysisTest.GetCallsiteIDNegativeTest (0 ms)
[----------] 1 test from CtxProfAnalysisTest (0 ms total)

[----------] 1 test from LazyCallGraphTest
[ RUN      ] LazyCallGraphTest.AddSplitFunctions6
AnalysisTests: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/lib/Analysis/LazyCallGraph.cpp:403: void llvm::LazyCallGraph::RefSCC::verify(): Assertion `Visited.contains(NodeToVisit) && "Cannot reach all nodes within RefSCC"' failed.
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  AnalysisTests   0x00000000011dbb47 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
1  AnalysisTests   0x00000000011d95cc llvm::sys::RunSignalHandlers() + 188
2  AnalysisTests   0x00000000011dc35a
3  libpthread.so.0 0x00007effb2dc5140
4  libc.so.6       0x00007effb28d9d51 gsignal + 321
5  libc.so.6       0x00007effb28c3537 abort + 291
6  libc.so.6       0x00007effb28c340f
7  libc.so.6       0x00007effb28d26d2
8  AnalysisTests   0x0000000000b677a8 llvm::LazyCallGraph::RefSCC::verify() + 3224
9  AnalysisTests   0x0000000000b76b0a llvm::LazyCallGraph::addSplitRefRecursiveFunctions(llvm::Function&, llvm::ArrayRef<llvm::Function*>) + 2234
10 AnalysisTests   0x000000000094e875
11 AnalysisTests   0x000000000160d74c testing::Test::Run() + 348
12 AnalysisTests   0x000000000160eba9 testing::TestInfo::Run() + 457
13 AnalysisTests   0x000000000160faa6 testing::TestSuite::Run() + 1014
14 AnalysisTests   0x0000000001623fed testing::internal::UnitTestImpl::RunAllTests() + 2909
15 AnalysisTests   0x000000000162334a testing::UnitTest::Run() + 90
16 AnalysisTests   0x00000000015edccc main + 140
17 libc.so.6       0x00007effb28c4d7a __libc_start_main + 234
18 AnalysisTests   0x000000000084b37a _start + 42

--
exit: -6
--
shard JSON output does not exist: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/unittests/Analysis/./AnalysisTests-LLVM-Unit-3491294-79-171.json
********************


@aeubanks
Copy link
Contributor

buildbot looks like a real failure under expensive checks, reverting to keep tree green

aeubanks added a commit that referenced this pull request Feb 15, 2025
…ows spurious ref edges between new functions." (#127285)

Reverts #116285

Breaks expensive checks build, e.g.
https://lab.llvm.org/buildbot/#/builders/16/builds/13821
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 15, 2025
…nctions allows spurious ref edges between new functions." (#127285)

Reverts llvm/llvm-project#116285

Breaks expensive checks build, e.g.
https://lab.llvm.org/buildbot/#/builders/16/builds/13821
@@ -1775,8 +1776,9 @@ void LazyCallGraph::addSplitRefRecursiveFunctions(
if (F1 == F2)
continue;
Node &N2 = get(*F2);
assert(!N1->lookup(N2)->isCall() &&
"Edges between new functions must be ref edges");
assert(!N1->lookup(N2) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it seems like we're not actually adding the spurious ref edges since Node::populateSlow() doesn't actually add the ref edge since it doesn't exist in the IR, so the RefSCC being created isn't a valid strongly connected component. we need to manually add the fake ref edges (which feels weird but oh well) as opposed to relaxing this check to allow the edge to not exist

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion on this. I will make another PR for this change soon.

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…ious ref edges between new functions. (llvm#116285)

The addSplitRefRecursiveFunctions LazyCallGraph helper should not
require a reference between every new function. Spurious ref edges
between the new functions are allowed and the new function are
considered to be a RefSCC. This change clarifies that this is the case
in the method's description and its DEBUG mode verifier.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
coroutines C++20 coroutines llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants