Skip to content

[Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source #106306

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

TylerNowicki
Copy link
Collaborator

@TylerNowicki TylerNowicki commented Aug 27, 2024

  • Move the SuspendCrossingInfo analysis helper into its own header/source

See RFC for more info: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor branch from 1582d12 to 06cde7c Compare September 3, 2024 14:28
Copy link

github-actions bot commented Sep 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2024

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-coroutines

Author: Tyler Nowicki (TylerNowicki)

Changes

Hi All,

This is an initial change as part of a series of PRs (see below) to add new capabilities to the llvm coroutine transformations by providing a mechanism that can be used to separate ABI and user code and at the same time supporting a wider variety of users beyond Swift and C++.

TLDR

This proposal adds an ABI object using a class hierarchy that can be used to implement the base ABIs (Switch, Asyc, and Retcon{Once}), as well as allowing customization of these ABIs through inheritance. Generator lambdas provided to CoroSplit are used to create the customized ABIs according to the new intrinsic llvm.coro.begin.custom. These changes will allow use to improve the separation of the code related to users, ABIs and utilities. Additionally, user specific code that need not be part of the llvm coroutine transforms can be implement within their users plugin libraries, so no need to pollute the upstream code. No code changes are required by any existing users. Instead the changes will allow future development and refactoring to incrementally improve the separation of code for different ABIs, users and shared utilities.

Problems Addressed

Separation of user code: Users such as Swift and C++ require specific behavior from the CoroSplit pass and its related helpers, especially, CoroFrame. However, that has resulted in code specific to certain users being shared and executed by all the users.

buildCoroutineFrame
> // For now, this works for C++ programs only.
> buildFrameDebugInfo(F, Shape, FrameData);

Separation of ABI code: To address the most significant cases of the above an ABI enum was added and is used by the many methods, functions, classes, etc. invoked by CoroSplit. Some of these uses perform a delegation of certain operations, such as how function splitting is invoked (below). However, in other places the uses of the enum is used for fine-grained differences between the ABIs such as those uses in buildCoroutineFrame (below) and it does not provide a mechanism to separate the code of different users.

splitCoroutine
> switch (Shape.ABI) {
> case coro::ABI::Switch:
> SwitchCoroutineSplitter::split(F, Shape, Clones, TTI);
> break;
> case coro::ABI::Async:
> splitAsyncCoroutine(F, Shape, Clones, TTI);
> break;
> case coro::ABI::Retcon:
> case coro::ABI::RetconOnce:
> splitRetconCoroutine(F, Shape, Clones, TTI);
> break;
> }

buildCoroutineFrame
> if (Shape.ABI != coro::ABI::Async && Shape.ABI != coro::ABI::Retcon &&
> Shape.ABI != coro::ABI::RetconOnce)
> sinkLifetimeStartMarkers(F, Shape, Checker, DT);

Materializable callback: This was added to solve functional issues with our non-C++, non-Swift user. It provides a limited method to control rematerialization of pre-suspend instructions after each suspend point and it is the only supported mechanism to customize CoroSplit. However, it is a function callback so it cannot access any state info being used by the CoroSplit pass. This makes it difficult to use to do anything more than checking the type of instruction and making a determination from that. The callback cannot be used to make an individual decision for each instruction and suspend point, i.e. it is not possible to materialize an inst based on its use or materialize an inst after a subset of suspend points that it may cross.

Plugin Libraries: Other uses such as C++ and Swift would ideally implement their own specific behavior within a plugin library that resides within their repos. Consider for example that llvm.coro.alloca.alloc and llvm.coro.alloca.free require a fair bit of code within CoroFrame and other helpers to handle them. This includes the classes: CoroAllocaAllocInst, CoroAllocaFreeInst and methods: lowerLocalAllocas, lowerNonLocalAlloca, code within buildCoroutineFrame and various others. However, as far as I can tell these intrinsics are only used by Swift and they are not even documented in https://llvm.org/docs/Coroutines.html. The problem is that most of the interfaces that are required by users for customization are not accessible to plugin libraries and this means right now user code must be embedded into the llvm transform passes.

Proposed solution

  1. Move helpers and utilities that are used by various ABIs/users into separate sources. For example, Shape can be moved to its own header so other users can use it.

  2. Split and simplify certain methods that can be used to 'delegate' certain operations of an ABI. For example, buildCoroutineFrame and the Shape constructor.

  3. For ABI code separation define several ABI objects (SwitchABI, AsyncABI, AnyRetConABI) using a class hierarchy. Each ABI implements a set of virtual methods, using the shared utils, to provide the specific behavior required.

  4. For user code separation, improved materialization, and to support plugin libraries add a constructor to CoroSplit that takes a list of ABI generator lambdas, the ABIs inherit from the above set and provide specific behavior. To select the generator a new intrinsic llvm.coro.begin.custom is used in place of llvm.coro.begin.

  5. For plugin librariers the utility headers are moved into include/llvm/Transform/Coroutines.

Discussion

  • The ABI objects use a delegate style approach, each virtual method performs a high-level operation required by CoroSplit. For example, buildCoroutineFrame, and splitCoroutine are virtual methods in the ABI object.
  • Users requiring specific behavior will: 1) implement their desired behavior by inheriting one of the ABI objects (SwitchABI, AsyncABI, AnyRetConABI), 2) provide a list of lambda generators for their customized ABIs when adding CoroSplit to the CGPM, and 3) specify their customized ABI by using the coro.begin.custom intrinsic instead of llvm.coro.begin.
  • It is not necessary to make any changes to existing user code. There is no need to update clang or swift. Those who work on c++ coroutines or swift may want to take advantage of the customizability afforded by these changes.
  • There is no need to make any changes to user code that uses the existing Materialization Callback. Notice I didn't have to modify the test case at all.
  • I added an additional test case for using a custom ABI object.
  • It is out of scope of these changes to rework everything. The purpose is to provide a new way to use the coroutine transformations, not necessarily to force all existing code and users to adopt this mechanism.
  • The series of PRs I am proposing will not seek to remove all uses of the Shape.ABI enum. Not all uses may be necessary or practical to remove and some uses may be considered beneficial to have (i.e. a utility method that works with any ABI).
  • Users such as c++ and swift may benefit from creating their own customized ABIs, but it is out of scope for this series of PRs to do so.

Proposed PRs

I propose the following series of changes, many of these changes are good changes on their own and can be merged while discussing the ABI object (PR 6) proposal.

  1. Move suspend crossing info into its own h/cpp (this PR).

  2. Move spilling to its own utils [llvm/llvm-project][Coroutines] Move spill related methods to a Spill utils TylerNowicki/llvm-project#1

  3. Separate normalization from buildCoroutineFrame: [llvm/llvm-project][Coroutines] Split buildCoroutineFrame TylerNowicki/llvm-project#2

  4. Move materialization to its own utils [llvm/llvm-project][Coroutines] Move materialization out of CoroFrame to MaterializationUtils.h TylerNowicki/llvm-project#3

  5. Move Shape to its own header [Coroutines] Move Shape to its own header TylerNowicki/llvm-project#4

  6. ABI Object [Coroutines] ABI Object TylerNowicki/llvm-project#5

  7. Move util headers to include/llvm for testing and plugin libraries [Coroutines] Move util headers to include/llvm TylerNowicki/llvm-project#6

  8. Support custom ABIs and plugin libraries [Coroutines] Support Custom ABIs and plugin libraries TylerNowicki/llvm-project#7
    Note: This includes a unit test for custom ABIs as used by plugin libraries.

  • Please let me know if you are not able to access the other PRs. I made the PRs in my forked repo so the changes made by each can be easily observed without creating additional PRs within the llvm-project/llvm repo.

Patch is 26.74 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/106306.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Coroutines/CMakeLists.txt (+1)
  • (modified) llvm/lib/Transforms/Coroutines/CoroFrame.cpp (+12-310)
  • (added) llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.cpp (+195)
  • (added) llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.h (+182)
diff --git a/llvm/lib/Transforms/Coroutines/CMakeLists.txt b/llvm/lib/Transforms/Coroutines/CMakeLists.txt
index 2139446e5ff957..57359acc81ee4c 100644
--- a/llvm/lib/Transforms/Coroutines/CMakeLists.txt
+++ b/llvm/lib/Transforms/Coroutines/CMakeLists.txt
@@ -6,6 +6,7 @@ add_llvm_component_library(LLVMCoroutines
   CoroElide.cpp
   CoroFrame.cpp
   CoroSplit.cpp
+  SuspendCrossingInfo.cpp
 
   ADDITIONAL_HEADER_DIRS
   ${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms/Coroutines
diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index f76cfe01b58cfd..7a631a646f3f8b 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -16,6 +16,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "CoroInternal.h"
+#include "SuspendCrossingInfo.h"
 #include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/ScopeExit.h"
@@ -51,315 +52,6 @@ extern cl::opt<bool> UseNewDbgInfoFormat;
 // "coro-frame", which results in leaner debug spew.
 #define DEBUG_TYPE "coro-suspend-crossing"
 
-enum { SmallVectorThreshold = 32 };
-
-// Provides two way mapping between the blocks and numbers.
-namespace {
-class BlockToIndexMapping {
-  SmallVector<BasicBlock *, SmallVectorThreshold> V;
-
-public:
-  size_t size() const { return V.size(); }
-
-  BlockToIndexMapping(Function &F) {
-    for (BasicBlock &BB : F)
-      V.push_back(&BB);
-    llvm::sort(V);
-  }
-
-  size_t blockToIndex(BasicBlock const *BB) const {
-    auto *I = llvm::lower_bound(V, BB);
-    assert(I != V.end() && *I == BB && "BasicBlockNumberng: Unknown block");
-    return I - V.begin();
-  }
-
-  BasicBlock *indexToBlock(unsigned Index) const { return V[Index]; }
-};
-} // end anonymous namespace
-
-// The SuspendCrossingInfo maintains data that allows to answer a question
-// whether given two BasicBlocks A and B there is a path from A to B that
-// passes through a suspend point.
-//
-// For every basic block 'i' it maintains a BlockData that consists of:
-//   Consumes:  a bit vector which contains a set of indices of blocks that can
-//              reach block 'i'. A block can trivially reach itself.
-//   Kills: a bit vector which contains a set of indices of blocks that can
-//          reach block 'i' but there is a path crossing a suspend point
-//          not repeating 'i' (path to 'i' without cycles containing 'i').
-//   Suspend: a boolean indicating whether block 'i' contains a suspend point.
-//   End: a boolean indicating whether block 'i' contains a coro.end intrinsic.
-//   KillLoop: There is a path from 'i' to 'i' not otherwise repeating 'i' that
-//             crosses a suspend point.
-//
-namespace {
-class SuspendCrossingInfo {
-  BlockToIndexMapping Mapping;
-
-  struct BlockData {
-    BitVector Consumes;
-    BitVector Kills;
-    bool Suspend = false;
-    bool End = false;
-    bool KillLoop = false;
-    bool Changed = false;
-  };
-  SmallVector<BlockData, SmallVectorThreshold> Block;
-
-  iterator_range<pred_iterator> predecessors(BlockData const &BD) const {
-    BasicBlock *BB = Mapping.indexToBlock(&BD - &Block[0]);
-    return llvm::predecessors(BB);
-  }
-
-  BlockData &getBlockData(BasicBlock *BB) {
-    return Block[Mapping.blockToIndex(BB)];
-  }
-
-  /// Compute the BlockData for the current function in one iteration.
-  /// Initialize - Whether this is the first iteration, we can optimize
-  /// the initial case a little bit by manual loop switch.
-  /// Returns whether the BlockData changes in this iteration.
-  template <bool Initialize = false>
-  bool computeBlockData(const ReversePostOrderTraversal<Function *> &RPOT);
-
-public:
-#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
-  void dump() const;
-  void dump(StringRef Label, BitVector const &BV,
-            const ReversePostOrderTraversal<Function *> &RPOT) const;
-#endif
-
-  SuspendCrossingInfo(Function &F, coro::Shape &Shape);
-
-  /// Returns true if there is a path from \p From to \p To crossing a suspend
-  /// point without crossing \p From a 2nd time.
-  bool hasPathCrossingSuspendPoint(BasicBlock *From, BasicBlock *To) const {
-    size_t const FromIndex = Mapping.blockToIndex(From);
-    size_t const ToIndex = Mapping.blockToIndex(To);
-    bool const Result = Block[ToIndex].Kills[FromIndex];
-    LLVM_DEBUG(dbgs() << From->getName() << " => " << To->getName()
-                      << " answer is " << Result << "\n");
-    return Result;
-  }
-
-  /// Returns true if there is a path from \p From to \p To crossing a suspend
-  /// point without crossing \p From a 2nd time. If \p From is the same as \p To
-  /// this will also check if there is a looping path crossing a suspend point.
-  bool hasPathOrLoopCrossingSuspendPoint(BasicBlock *From,
-                                         BasicBlock *To) const {
-    size_t const FromIndex = Mapping.blockToIndex(From);
-    size_t const ToIndex = Mapping.blockToIndex(To);
-    bool Result = Block[ToIndex].Kills[FromIndex] ||
-                  (From == To && Block[ToIndex].KillLoop);
-    LLVM_DEBUG(dbgs() << From->getName() << " => " << To->getName()
-                      << " answer is " << Result << " (path or loop)\n");
-    return Result;
-  }
-
-  bool isDefinitionAcrossSuspend(BasicBlock *DefBB, User *U) const {
-    auto *I = cast<Instruction>(U);
-
-    // We rewrote PHINodes, so that only the ones with exactly one incoming
-    // value need to be analyzed.
-    if (auto *PN = dyn_cast<PHINode>(I))
-      if (PN->getNumIncomingValues() > 1)
-        return false;
-
-    BasicBlock *UseBB = I->getParent();
-
-    // As a special case, treat uses by an llvm.coro.suspend.retcon or an
-    // llvm.coro.suspend.async as if they were uses in the suspend's single
-    // predecessor: the uses conceptually occur before the suspend.
-    if (isa<CoroSuspendRetconInst>(I) || isa<CoroSuspendAsyncInst>(I)) {
-      UseBB = UseBB->getSinglePredecessor();
-      assert(UseBB && "should have split coro.suspend into its own block");
-    }
-
-    return hasPathCrossingSuspendPoint(DefBB, UseBB);
-  }
-
-  bool isDefinitionAcrossSuspend(Argument &A, User *U) const {
-    return isDefinitionAcrossSuspend(&A.getParent()->getEntryBlock(), U);
-  }
-
-  bool isDefinitionAcrossSuspend(Instruction &I, User *U) const {
-    auto *DefBB = I.getParent();
-
-    // As a special case, treat values produced by an llvm.coro.suspend.*
-    // as if they were defined in the single successor: the uses
-    // conceptually occur after the suspend.
-    if (isa<AnyCoroSuspendInst>(I)) {
-      DefBB = DefBB->getSingleSuccessor();
-      assert(DefBB && "should have split coro.suspend into its own block");
-    }
-
-    return isDefinitionAcrossSuspend(DefBB, U);
-  }
-
-  bool isDefinitionAcrossSuspend(Value &V, User *U) const {
-    if (auto *Arg = dyn_cast<Argument>(&V))
-      return isDefinitionAcrossSuspend(*Arg, U);
-    if (auto *Inst = dyn_cast<Instruction>(&V))
-      return isDefinitionAcrossSuspend(*Inst, U);
-
-    llvm_unreachable(
-        "Coroutine could only collect Argument and Instruction now.");
-  }
-};
-} // end anonymous namespace
-
-#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
-static std::string getBasicBlockLabel(const BasicBlock *BB) {
-  if (BB->hasName())
-    return BB->getName().str();
-
-  std::string S;
-  raw_string_ostream OS(S);
-  BB->printAsOperand(OS, false);
-  return OS.str().substr(1);
-}
-
-LLVM_DUMP_METHOD void SuspendCrossingInfo::dump(
-    StringRef Label, BitVector const &BV,
-    const ReversePostOrderTraversal<Function *> &RPOT) const {
-  dbgs() << Label << ":";
-  for (const BasicBlock *BB : RPOT) {
-    auto BBNo = Mapping.blockToIndex(BB);
-    if (BV[BBNo])
-      dbgs() << " " << getBasicBlockLabel(BB);
-  }
-  dbgs() << "\n";
-}
-
-LLVM_DUMP_METHOD void SuspendCrossingInfo::dump() const {
-  if (Block.empty())
-    return;
-
-  BasicBlock *const B = Mapping.indexToBlock(0);
-  Function *F = B->getParent();
-
-  ReversePostOrderTraversal<Function *> RPOT(F);
-  for (const BasicBlock *BB : RPOT) {
-    auto BBNo = Mapping.blockToIndex(BB);
-    dbgs() << getBasicBlockLabel(BB) << ":\n";
-    dump("   Consumes", Block[BBNo].Consumes, RPOT);
-    dump("      Kills", Block[BBNo].Kills, RPOT);
-  }
-  dbgs() << "\n";
-}
-#endif
-
-template <bool Initialize>
-bool SuspendCrossingInfo::computeBlockData(
-    const ReversePostOrderTraversal<Function *> &RPOT) {
-  bool Changed = false;
-
-  for (const BasicBlock *BB : RPOT) {
-    auto BBNo = Mapping.blockToIndex(BB);
-    auto &B = Block[BBNo];
-
-    // We don't need to count the predecessors when initialization.
-    if constexpr (!Initialize)
-      // If all the predecessors of the current Block don't change,
-      // the BlockData for the current block must not change too.
-      if (all_of(predecessors(B), [this](BasicBlock *BB) {
-            return !Block[Mapping.blockToIndex(BB)].Changed;
-          })) {
-        B.Changed = false;
-        continue;
-      }
-
-    // Saved Consumes and Kills bitsets so that it is easy to see
-    // if anything changed after propagation.
-    auto SavedConsumes = B.Consumes;
-    auto SavedKills = B.Kills;
-
-    for (BasicBlock *PI : predecessors(B)) {
-      auto PrevNo = Mapping.blockToIndex(PI);
-      auto &P = Block[PrevNo];
-
-      // Propagate Kills and Consumes from predecessors into B.
-      B.Consumes |= P.Consumes;
-      B.Kills |= P.Kills;
-
-      // If block P is a suspend block, it should propagate kills into block
-      // B for every block P consumes.
-      if (P.Suspend)
-        B.Kills |= P.Consumes;
-    }
-
-    if (B.Suspend) {
-      // If block B is a suspend block, it should kill all of the blocks it
-      // consumes.
-      B.Kills |= B.Consumes;
-    } else if (B.End) {
-      // If block B is an end block, it should not propagate kills as the
-      // blocks following coro.end() are reached during initial invocation
-      // of the coroutine while all the data are still available on the
-      // stack or in the registers.
-      B.Kills.reset();
-    } else {
-      // This is reached when B block it not Suspend nor coro.end and it
-      // need to make sure that it is not in the kill set.
-      B.KillLoop |= B.Kills[BBNo];
-      B.Kills.reset(BBNo);
-    }
-
-    if constexpr (!Initialize) {
-      B.Changed = (B.Kills != SavedKills) || (B.Consumes != SavedConsumes);
-      Changed |= B.Changed;
-    }
-  }
-
-  return Changed;
-}
-
-SuspendCrossingInfo::SuspendCrossingInfo(Function &F, coro::Shape &Shape)
-    : Mapping(F) {
-  const size_t N = Mapping.size();
-  Block.resize(N);
-
-  // Initialize every block so that it consumes itself
-  for (size_t I = 0; I < N; ++I) {
-    auto &B = Block[I];
-    B.Consumes.resize(N);
-    B.Kills.resize(N);
-    B.Consumes.set(I);
-    B.Changed = true;
-  }
-
-  // Mark all CoroEnd Blocks. We do not propagate Kills beyond coro.ends as
-  // the code beyond coro.end is reachable during initial invocation of the
-  // coroutine.
-  for (auto *CE : Shape.CoroEnds)
-    getBlockData(CE->getParent()).End = true;
-
-  // Mark all suspend blocks and indicate that they kill everything they
-  // consume. Note, that crossing coro.save also requires a spill, as any code
-  // between coro.save and coro.suspend may resume the coroutine and all of the
-  // state needs to be saved by that time.
-  auto markSuspendBlock = [&](IntrinsicInst *BarrierInst) {
-    BasicBlock *SuspendBlock = BarrierInst->getParent();
-    auto &B = getBlockData(SuspendBlock);
-    B.Suspend = true;
-    B.Kills |= B.Consumes;
-  };
-  for (auto *CSI : Shape.CoroSuspends) {
-    markSuspendBlock(CSI);
-    if (auto *Save = CSI->getCoroSave())
-      markSuspendBlock(Save);
-  }
-
-  // It is considered to be faster to use RPO traversal for forward-edges
-  // dataflow analysis.
-  ReversePostOrderTraversal<Function *> RPOT(&F);
-  computeBlockData</*Initialize=*/true>(RPOT);
-  while (computeBlockData</*Initialize*/ false>(RPOT))
-    ;
-
-  LLVM_DEBUG(dump());
-}
-
 namespace {
 
 // RematGraph is used to construct a DAG for rematerializable instructions
@@ -438,6 +130,16 @@ struct RematGraph {
   }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  static std::string getBasicBlockLabel(const BasicBlock *BB) {
+    if (BB->hasName())
+      return BB->getName().str();
+
+    std::string S;
+    raw_string_ostream OS(S);
+    BB->printAsOperand(OS, false);
+    return OS.str().substr(1);
+  }
+
   void dump() const {
     dbgs() << "Entry (";
     dbgs() << getBasicBlockLabel(EntryNode->Node->getParent());
@@ -3159,7 +2861,7 @@ void coro::buildCoroutineFrame(
   rewritePHIs(F);
 
   // Build suspend crossing info.
-  SuspendCrossingInfo Checker(F, Shape);
+  SuspendCrossingInfo Checker(F, Shape.CoroSuspends, Shape.CoroEnds);
 
   doRematerializations(F, Checker, MaterializableCallback);
 
diff --git a/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.cpp b/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.cpp
new file mode 100644
index 00000000000000..ff3b32e958edac
--- /dev/null
+++ b/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.cpp
@@ -0,0 +1,195 @@
+//===- SuspendCrossingInfo.cpp - Utility for suspend crossing values ------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// The SuspendCrossingInfo maintains data that allows to answer a question
+// whether given two BasicBlocks A and B there is a path from A to B that
+// passes through a suspend point.
+//===----------------------------------------------------------------------===//
+
+#include "SuspendCrossingInfo.h"
+
+// The "coro-suspend-crossing" flag is very noisy. There is another debug type,
+// "coro-frame", which results in leaner debug spew.
+#define DEBUG_TYPE "coro-suspend-crossing"
+
+namespace llvm {
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+static std::string getBasicBlockLabel(const BasicBlock *BB) {
+  if (BB->hasName())
+    return BB->getName().str();
+
+  std::string S;
+  raw_string_ostream OS(S);
+  BB->printAsOperand(OS, false);
+  return OS.str().substr(1);
+}
+
+LLVM_DUMP_METHOD void SuspendCrossingInfo::dump(
+    StringRef Label, BitVector const &BV,
+    const ReversePostOrderTraversal<Function *> &RPOT) const {
+  dbgs() << Label << ":";
+  for (const BasicBlock *BB : RPOT) {
+    auto BBNo = Mapping.blockToIndex(BB);
+    if (BV[BBNo])
+      dbgs() << " " << getBasicBlockLabel(BB);
+  }
+  dbgs() << "\n";
+}
+
+LLVM_DUMP_METHOD void SuspendCrossingInfo::dump() const {
+  if (Block.empty())
+    return;
+
+  BasicBlock *const B = Mapping.indexToBlock(0);
+  Function *F = B->getParent();
+
+  ReversePostOrderTraversal<Function *> RPOT(F);
+  for (const BasicBlock *BB : RPOT) {
+    auto BBNo = Mapping.blockToIndex(BB);
+    dbgs() << getBasicBlockLabel(BB) << ":\n";
+    dump("   Consumes", Block[BBNo].Consumes, RPOT);
+    dump("      Kills", Block[BBNo].Kills, RPOT);
+  }
+  dbgs() << "\n";
+}
+#endif
+
+bool SuspendCrossingInfo::hasPathCrossingSuspendPoint(BasicBlock *From,
+                                                      BasicBlock *To) const {
+  size_t const FromIndex = Mapping.blockToIndex(From);
+  size_t const ToIndex = Mapping.blockToIndex(To);
+  bool const Result = Block[ToIndex].Kills[FromIndex];
+  LLVM_DEBUG(if (Result) dbgs() << From->getName() << " => " << To->getName()
+                                << " crosses suspend point\n");
+  return Result;
+}
+
+bool SuspendCrossingInfo::hasPathOrLoopCrossingSuspendPoint(
+    BasicBlock *From, BasicBlock *To) const {
+  size_t const FromIndex = Mapping.blockToIndex(From);
+  size_t const ToIndex = Mapping.blockToIndex(To);
+  bool Result = Block[ToIndex].Kills[FromIndex] ||
+                (From == To && Block[ToIndex].KillLoop);
+  LLVM_DEBUG(if (Result) dbgs() << From->getName() << " => " << To->getName()
+                                << " crosses suspend point (path or loop)\n");
+  return Result;
+}
+
+template <bool Initialize>
+bool SuspendCrossingInfo::computeBlockData(
+    const ReversePostOrderTraversal<Function *> &RPOT) {
+  bool Changed = false;
+
+  for (const BasicBlock *BB : RPOT) {
+    auto BBNo = Mapping.blockToIndex(BB);
+    auto &B = Block[BBNo];
+
+    // We don't need to count the predecessors when initialization.
+    if constexpr (!Initialize)
+      // If all the predecessors of the current Block don't change,
+      // the BlockData for the current block must not change too.
+      if (all_of(predecessors(B), [this](BasicBlock *BB) {
+            return !Block[Mapping.blockToIndex(BB)].Changed;
+          })) {
+        B.Changed = false;
+        continue;
+      }
+
+    // Saved Consumes and Kills bitsets so that it is easy to see
+    // if anything changed after propagation.
+    auto SavedConsumes = B.Consumes;
+    auto SavedKills = B.Kills;
+
+    for (BasicBlock *PI : predecessors(B)) {
+      auto PrevNo = Mapping.blockToIndex(PI);
+      auto &P = Block[PrevNo];
+
+      // Propagate Kills and Consumes from predecessors into B.
+      B.Consumes |= P.Consumes;
+      B.Kills |= P.Kills;
+
+      // If block P is a suspend block, it should propagate kills into block
+      // B for every block P consumes.
+      if (P.Suspend)
+        B.Kills |= P.Consumes;
+    }
+
+    if (B.Suspend) {
+      // If block B is a suspend block, it should kill all of the blocks it
+      // consumes.
+      B.Kills |= B.Consumes;
+    } else if (B.End) {
+      // If block B is an end block, it should not propagate kills as the
+      // blocks following coro.end() are reached during initial invocation
+      // of the coroutine while all the data are still available on the
+      // stack or in the registers.
+      B.Kills.reset();
+    } else {
+      // This is reached when B block it not Suspend nor coro.end and it
+      // need to make sure that it is not in the kill set.
+      B.KillLoop |= B.Kills[BBNo];
+      B.Kills.reset(BBNo);
+    }
+
+    if constexpr (!Initialize) {
+      B.Changed = (B.Kills != SavedKills) || (B.Consumes != SavedConsumes);
+      Changed |= B.Changed;
+    }
+  }
+
+  return Changed;
+}
+
+SuspendCrossingInfo::SuspendCrossingInfo(
+    Function &F, const SmallVectorImpl<AnyCoroSuspendInst *> &CoroSuspends,
+    const SmallVectorImpl<AnyCoroEndInst *> &CoroEnds)
+    : Mapping(F) {
+  const size_t N = Mapping.size();
+  Block.resize(N);
+
+  // Initialize every block so that it consumes itself
+  for (size_t I = 0; I < N; ++I) {
+    auto &B = Block[I];
+    B.Consumes.resize(N);
+    B.Kills.resize(N);
+    B.Consumes.set(I);
+    B.Changed = true;
+  }
+
+  // Mark all CoroEnd Blocks. We do not propagate Kills beyond coro.ends as
+  // the code beyond coro.end is reachable during initial invocation of the
+  // coroutine.
+  for (auto *CE : CoroEnds)
+    getBlockData(CE->getParent()).End = true;
+
+  // Mark all suspend blocks and indicate that they kill everything they
+  // consume. Note, that crossing coro.save also requires a spill, as any code
+  // between coro.save and coro.suspend may resume the coroutine and all of the
+  // state needs to be saved by that time.
+  auto markSuspendBlock = [&](IntrinsicInst *BarrierInst) {
+    BasicBlock *SuspendBlock = BarrierInst->getParent();
+    auto &B = getBlockData(SuspendBlock);
+    B.Suspend = true;
+    B.Kills |= B.Consumes;
+  };
+  for (auto *CSI : CoroSuspends) {
+    markSuspendBlock(CSI);
+    if (auto *Save = CSI->getCoroSave())
+      markSuspendBlock(Save);
+  }
+
+  // It is considered to be faster to use RPO traversal for forward-edges
+  // dataflow analysis.
+  ReversePostOrderTraversal<Function *> RPOT(&F);
+  computeBlockData</*Initialize=*/true>(RPOT);
+  while (computeBlockData</*Initialize*/ false>(RPOT))
+    ;
+
+  LLVM_DEBUG(dump());
+}
+
+} // namespace llvm
diff --git a/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.h b/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.h
new file mode 100644
index 00000000000000..d1551af0b54975
--- /dev/null
+++ b/llvm/lib/Transforms/Coroutines/SuspendCrossingInfo.h
@@ -0,0 +1,182 @@
+//===- SuspendCrossingInfo.cpp - Utility for suspend cros...
[truncated]

@ChuanqiXu9
Copy link
Member

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).

But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

BlockToIndexMapping(Function &F) {
for (BasicBlock &BB : F)
V.push_back(&BB);
llvm::sort(V);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this sorting by pointer value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we can get the index of the BB by lower_bound. Otherwise we have to introduce a map here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the block index is already available in BB.getNumber()?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know this. Could we get the BB back from the number? If yes, this is an improvement.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can, but it requires chasing the block list. Is this used in a way that introduces runtime pointer value dependence?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can, but it requires chasing the block list.

So we need a linear search? Then it is not wanted.

Is this used in a way that introduces runtime pointer value dependence?

I am not sure I understand your points 100%. But this is only used to have a simple way to offer a two way map from BB and indexes. Then we can have a bitmap for the BB. And this is only used as an analysis in CoroSplit pass. So it at least won't introduce dependence on runtime pointer value across passes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick check for the use of this info in CoroSplit pass. And if I didn't miss a point, we won't create or remove BB during the use of the information. But I admit it is dangerous indeed.

Copy link
Collaborator Author

@TylerNowicki TylerNowicki Sep 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very interesting, I didn't notice this while reading the code earlier. The restriction that we cannot add/remove BB while using SuspendCrossingInfo is not clearly documented. I can add a comment for this (see the header desc for the comment).

I suppose if people don't mind using a map we can create one for BB.getNumber() -> BB*, but that is for another PR. Why is it a problem to use a map here?

@TylerNowicki
Copy link
Collaborator Author

TylerNowicki commented Sep 4, 2024

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).

But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

Sorry I may not have followed the BKM here for presenting the changes. If you look at the 'Proposed PRs' part of the proposal it has a list of PRs. I split it up into separate PRs because one giant PR would not be easy to review. The first PR is from my branch (A) in my forked repo to llvm's main. The second PR is from another branch in my forked repo (B) to branch (A). And so on... So once branch (A) is merged to llvm's main then I will update the PR for (B) to point to llvm's main and that can be merged. This should allow each PR's changes to be reviewed right now without merging any PRs. Can you recommend another method to present the changes (changes that build on each other one to the next) so they can be easily reviewed?

@ChuanqiXu9
Copy link
Member

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).
But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

Sorry I may not have followed the BKM here for presenting the changes. If you look at the 'Proposed PRs' part of the proposal it has a list of PRs. I split it up into separate PRs because one giant PR would not be easy to review. The first PR is from my branch (A) in my forked repo to llvm's main. The second PR is from another branch in my forked repo (B) to branch (A). And so on... So once branch (A) is merged to llvm's main then I will update the PR for (B) to point to llvm's main and that can be merged. This should allow each PR's changes to be reviewed right now without merging any PRs. Can you recommend another method to present the changes (changes that build on each other one to the next) so they can be easily reviewed?

No, this is not my points. I feel good with your method to split the PRs. And I feel good with the codes of the current PR and the proposal itself. But the problem is, I don't think such proposal should be presented here in a PR. But it should be sent to https://discourse.llvm.org/ to get more visibility. And for the message body of the current PR, it would be better to describe the changes actually.

Currently the problem is, the message body of the PR is less related to the changes in the PR.

@TylerNowicki
Copy link
Collaborator Author

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).
But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

Sorry I may not have followed the BKM here for presenting the changes. If you look at the 'Proposed PRs' part of the proposal it has a list of PRs. I split it up into separate PRs because one giant PR would not be easy to review. The first PR is from my branch (A) in my forked repo to llvm's main. The second PR is from another branch in my forked repo (B) to branch (A). And so on... So once branch (A) is merged to llvm's main then I will update the PR for (B) to point to llvm's main and that can be merged. This should allow each PR's changes to be reviewed right now without merging any PRs. Can you recommend another method to present the changes (changes that build on each other one to the next) so they can be easily reviewed?

No, this is not my points. I feel good with your method to split the PRs. And I feel good with the codes of the current PR and the proposal itself. But the problem is, I don't think such proposal should be presented here in a PR. But it should be sent to https://discourse.llvm.org/ to get more visibility. And for the message body of the current PR, it would be better to describe the changes actually.

Currently the problem is, the message body of the PR is less related to the changes in the PR.

Thanks I will create an RFC there and update this PR message.

@TylerNowicki
Copy link
Collaborator Author

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).
But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

Sorry I may not have followed the BKM here for presenting the changes. If you look at the 'Proposed PRs' part of the proposal it has a list of PRs. I split it up into separate PRs because one giant PR would not be easy to review. The first PR is from my branch (A) in my forked repo to llvm's main. The second PR is from another branch in my forked repo (B) to branch (A). And so on... So once branch (A) is merged to llvm's main then I will update the PR for (B) to point to llvm's main and that can be merged. This should allow each PR's changes to be reviewed right now without merging any PRs. Can you recommend another method to present the changes (changes that build on each other one to the next) so they can be easily reviewed?

No, this is not my points. I feel good with your method to split the PRs. And I feel good with the codes of the current PR and the proposal itself. But the problem is, I don't think such proposal should be presented here in a PR. But it should be sent to https://discourse.llvm.org/ to get more visibility. And for the message body of the current PR, it would be better to describe the changes actually.
Currently the problem is, the message body of the PR is less related to the changes in the PR.

Thanks I will create an RFC there and update this PR message.

Done. See RFC here: https://discourse.llvm.org/t/rfc-abi-objects-for-coroutines/81057

@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor branch 2 times, most recently from 7563efd to 6546d68 Compare September 4, 2024 15:25
… helper into its own header/source

* Move SuspendCrossingInfo to its own files to clean up CoroFrame
@TylerNowicki TylerNowicki changed the title [llvm/llvm-project][Coroutines] ABI Object [llvm/llvm-project][Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source Sep 4, 2024
@TylerNowicki TylerNowicki force-pushed the users/tylernowicki/coro-refactor branch from 6546d68 to 1186cd8 Compare September 4, 2024 15:27
@TylerNowicki
Copy link
Collaborator Author

This patch looks good. And the proposal looks good generally (it is long and maybe I missed some details).

But the problem is.. it looks like the contents of the PR is less related to the message body. It may be confusing. And also, I think such proposal should be sent to https://discourse.llvm.org/ to get a wider visibility.

Can I take this as a lgtm with the changes to the PR and RFC?

Copy link
Member

@ChuanqiXu9 ChuanqiXu9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

nit: please remove the prefix '[llvm/llvm-project]' in the title when committing.

@TylerNowicki TylerNowicki merged commit ea2da57 into llvm:main Sep 9, 2024
8 checks passed
@TylerNowicki TylerNowicki changed the title [llvm/llvm-project][Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source [Coroutines] Move the SuspendCrossingInfo analysis helper into its own header/source Sep 9, 2024
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 9, 2024

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1111

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/43/86' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-21984-43-86.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=86 GTEST_SHARD_INDEX=43 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants