Skip to content

[CGSCC] Fix compile time blowup with large RefSCCs #94815

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions llvm/include/llvm/Analysis/CGSCCPassManager.h
Original file line number Diff line number Diff line change
Expand Up @@ -306,6 +306,10 @@ struct CGSCCUpdateResult {
SmallDenseSet<std::pair<LazyCallGraph::Node *, LazyCallGraph::SCC *>, 4>
&InlinedInternalEdges;

/// Functions that a pass has considered to be dead to be removed at the end
/// of the call graph walk in batch.
SmallVector<Function *, 4> &DeadFunctions;

/// Weak VHs to keep track of indirect calls for the purposes of detecting
/// devirtualization.
///
Expand Down
22 changes: 11 additions & 11 deletions llvm/include/llvm/Analysis/LazyCallGraph.h
Original file line number Diff line number Diff line change
Expand Up @@ -832,7 +832,7 @@ class LazyCallGraph {
/// self-edges and edge removals which result in a spanning tree with no
/// more cycles.
[[nodiscard]] SmallVector<RefSCC *, 1>
removeInternalRefEdge(Node &SourceN, ArrayRef<Node *> TargetNs);
removeInternalRefEdges(ArrayRef<std::pair<Node *, Node *>> Edges);

/// A convenience wrapper around the above to handle trivial cases of
/// inserting a new call edge.
Expand Down Expand Up @@ -1056,18 +1056,18 @@ class LazyCallGraph {
/// once SCCs have started to be formed. These routines have strict contracts
/// but may be called at any point.

/// Remove a dead function from the call graph (typically to delete it).
/// Remove dead functions from the call graph.
///
/// Note that the function must have an empty use list, and the call graph
/// must be up-to-date prior to calling this. That means it is by itself in
/// a maximal SCC which is by itself in a maximal RefSCC, etc. No structural
/// changes result from calling this routine other than potentially removing
/// entry points into the call graph.
/// These functions should have already been passed to markDeadFunction().
/// This is done as a batch to prevent compile time blowup as a result of
/// handling a single function at a time.
void removeDeadFunctions(ArrayRef<Function *> DeadFs);

/// Mark a function as dead to be removed later by removeDeadFunctions().
///
/// If SCC formation has begun, this function must not be part of the current
/// DFS in order to call this safely. Typically, the function will have been
/// fully visited by the DFS prior to calling this routine.
void removeDeadFunction(Function &F);
/// The function body should have no incoming or outgoing call or ref edges.
/// For example, a function with a single "unreachable" instruction.
void markDeadFunction(Function &F);

/// Add a new function split/outlined from an existing function.
///
Expand Down
42 changes: 9 additions & 33 deletions llvm/lib/Analysis/CGSCCPassManager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -158,10 +158,12 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
SmallDenseSet<std::pair<LazyCallGraph::Node *, LazyCallGraph::SCC *>, 4>
InlinedInternalEdges;

SmallVector<Function *, 4> DeadFunctions;

CGSCCUpdateResult UR = {
RCWorklist, CWorklist, InvalidRefSCCSet,
InvalidSCCSet, nullptr, PreservedAnalyses::all(),
InlinedInternalEdges, {}};
RCWorklist, CWorklist, InvalidRefSCCSet,
InvalidSCCSet, nullptr, PreservedAnalyses::all(),
InlinedInternalEdges, DeadFunctions, {}};

// Request PassInstrumentation from analysis manager, will use it to run
// instrumenting callbacks for the passes later.
Expand Down Expand Up @@ -340,6 +342,10 @@ ModuleToPostOrderCGSCCPassAdaptor::run(Module &M, ModuleAnalysisManager &AM) {
} while (!RCWorklist.empty());
}

CG.removeDeadFunctions(DeadFunctions);
for (Function *DeadF : DeadFunctions)
DeadF->eraseFromParent();

#if defined(EXPENSIVE_CHECKS)
// Verify that the call graph is still valid.
CG.verify();
Expand Down Expand Up @@ -1030,36 +1036,6 @@ static LazyCallGraph::SCC &updateCGAndAnalysisManagerForPass(
return true;
});

// Now do a batch removal of the internal ref edges left.
auto NewRefSCCs = RC->removeInternalRefEdge(N, DeadTargets);
if (!NewRefSCCs.empty()) {
// The old RefSCC is dead, mark it as such.
UR.InvalidatedRefSCCs.insert(RC);

// Note that we don't bother to invalidate analyses as ref-edge
// connectivity is not really observable in any way and is intended
// exclusively to be used for ordering of transforms rather than for
// analysis conclusions.

// Update RC to the "bottom".
assert(G.lookupSCC(N) == C && "Changed the SCC when splitting RefSCCs!");
RC = &C->getOuterRefSCC();
assert(G.lookupRefSCC(N) == RC && "Failed to update current RefSCC!");

// The RC worklist is in reverse postorder, so we enqueue the new ones in
// RPO except for the one which contains the source node as that is the
// "bottom" we will continue processing in the bottom-up walk.
assert(NewRefSCCs.front() == RC &&
"New current RefSCC not first in the returned list!");
for (RefSCC *NewRC : llvm::reverse(llvm::drop_begin(NewRefSCCs))) {
assert(NewRC != RC && "Should not encounter the current RefSCC further "
"in the postorder list of new RefSCCs.");
UR.RCWorklist.insert(NewRC);
LLVM_DEBUG(dbgs() << "Enqueuing a new RefSCC in the update worklist: "
<< *NewRC << "\n");
}
}

// Next demote all the call edges that are now ref edges. This helps make
// the SCCs small which should minimize the work below as we don't want to
// form cycles that this would break.
Expand Down
123 changes: 64 additions & 59 deletions llvm/lib/Analysis/LazyCallGraph.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1160,8 +1160,8 @@ void LazyCallGraph::RefSCC::removeOutgoingEdge(Node &SourceN, Node &TargetN) {
}

SmallVector<LazyCallGraph::RefSCC *, 1>
LazyCallGraph::RefSCC::removeInternalRefEdge(Node &SourceN,
ArrayRef<Node *> TargetNs) {
LazyCallGraph::RefSCC::removeInternalRefEdges(
ArrayRef<std::pair<Node *, Node *>> Edges) {
// We return a list of the resulting *new* RefSCCs in post-order.
SmallVector<RefSCC *, 1> Result;

Expand All @@ -1179,25 +1179,21 @@ LazyCallGraph::RefSCC::removeInternalRefEdge(Node &SourceN,
#endif

// First remove the actual edges.
for (Node *TargetN : TargetNs) {
assert(!(*SourceN)[*TargetN].isCall() &&
for (auto [SourceN, TargetN] : Edges) {
assert(!(**SourceN)[*TargetN].isCall() &&
"Cannot remove a call edge, it must first be made a ref edge");

bool Removed = SourceN->removeEdgeInternal(*TargetN);
bool Removed = (*SourceN)->removeEdgeInternal(*TargetN);
(void)Removed;
assert(Removed && "Target not in the edge set for this caller?");
}

// Direct self references don't impact the ref graph at all.
if (llvm::all_of(TargetNs,
[&](Node *TargetN) { return &SourceN == TargetN; }))
return Result;

// If all targets are in the same SCC as the source, because no call edges
// were removed there is no RefSCC structure change.
SCC &SourceC = *G->lookupSCC(SourceN);
if (llvm::all_of(TargetNs, [&](Node *TargetN) {
return G->lookupSCC(*TargetN) == &SourceC;
if (llvm::all_of(Edges, [&](std::pair<Node *, Node *> E) {
return E.first == E.second ||
G->lookupSCC(*E.first) == G->lookupSCC(*E.second);
}))
return Result;

Expand Down Expand Up @@ -1499,7 +1495,7 @@ void LazyCallGraph::removeEdge(Node &SourceN, Node &TargetN) {
assert(Removed && "Target not in the edge set for this caller?");
}

void LazyCallGraph::removeDeadFunction(Function &F) {
void LazyCallGraph::markDeadFunction(Function &F) {
// FIXME: This is unnecessarily restrictive. We should be able to remove
// functions which recursively call themselves.
assert(F.hasZeroLiveUses() &&
Expand All @@ -1515,57 +1511,66 @@ void LazyCallGraph::removeDeadFunction(Function &F) {

Node &N = *NI->second;

// Cannot remove a function which has yet to be visited in the DFS walk, so
// if we have a node at all then we must have an SCC and RefSCC.
auto CI = SCCMap.find(&N);
assert(CI != SCCMap.end() &&
"Tried to remove a node without an SCC after DFS walk started!");
SCC &C = *CI->second;
RefSCC *RC = &C.getOuterRefSCC();

// In extremely rare cases, we can delete a dead function which is still in a
// non-trivial RefSCC. This can happen due to spurious ref edges sticking
// around after an IR function reference is removed.
if (RC->size() != 1) {
SmallVector<Node *, 0> NodesInRC;
for (SCC &OtherC : *RC) {
for (Node &OtherN : OtherC)
NodesInRC.push_back(&OtherN);
// Remove all call edges out of dead function.
for (Edge E : *N) {
if (E.isCall())
N->setEdgeKind(E.getNode(), Edge::Ref);
}
}

void LazyCallGraph::removeDeadFunctions(ArrayRef<Function *> DeadFs) {
if (DeadFs.empty())
return;

// Group dead functions by the RefSCC they're in.
DenseMap<RefSCC *, SmallVector<Node *, 1>> RCs;
for (Function *DeadF : DeadFs) {
Node *N = lookup(*DeadF);
#ifndef NDEBUG
for (Edge &E : **N) {
assert(!E.isCall() &&
"dead function shouldn't have any outgoing call edges");
}
for (Node *OtherN : NodesInRC) {
if ((*OtherN)->lookup(N)) {
auto NewRefSCCs =
RC->removeInternalRefEdge(*OtherN, ArrayRef<Node *>(&N));
// If we've split into multiple RefSCCs, RC is now invalid and the
// RefSCC containing C will be different.
if (!NewRefSCCs.empty())
RC = &C.getOuterRefSCC();
#endif
RefSCC *RC = lookupRefSCC(*N);
RCs[RC].push_back(N);
}
// Remove outgoing edges from all dead functions. Dead functions should
// already have had their call edges removed in markDeadFunction(), so we only
// need to worry about spurious ref edges.
for (auto [RC, DeadNs] : RCs) {
SmallVector<std::pair<Node *, Node *>> InternalEdgesToRemove;
for (Node *DeadN : DeadNs) {
for (Edge &E : **DeadN) {
if (lookupRefSCC(E.getNode()) == RC)
InternalEdgesToRemove.push_back({DeadN, &E.getNode()});
else
RC->removeOutgoingEdge(*DeadN, E.getNode());
}
}
// We ignore the returned RefSCCs since at this point we're done with CGSCC
// iteration and don't need to add it to any worklists.
(void)RC->removeInternalRefEdges(InternalEdgesToRemove);
for (Node *DeadN : DeadNs) {
RefSCC *DeadRC = lookupRefSCC(*DeadN);
assert(DeadRC->size() == 1);
assert(DeadRC->begin()->size() == 1);
DeadRC->clear();
DeadRC->G = nullptr;
}
}
// Clean up data structures.
for (Function *DeadF : DeadFs) {
Node &N = *lookup(*DeadF);

EntryEdges.removeEdgeInternal(N);
SCCMap.erase(SCCMap.find(&N));
NodeMap.erase(NodeMap.find(DeadF));

NodeMap.erase(NI);
EntryEdges.removeEdgeInternal(N);
SCCMap.erase(CI);

// This node must be the only member of its SCC as it has no callers, and
// that SCC must be the only member of a RefSCC as it has no references.
// Validate these properties first.
assert(C.size() == 1 && "Dead functions must be in a singular SCC");
assert(RC->size() == 1 && "Dead functions must be in a singular RefSCC");

// Finally clear out all the data structures from the node down through the
// components. postorder_ref_scc_iterator will skip empty RefSCCs, so no need
// to adjust LazyCallGraph data structures.
N.clear();
N.G = nullptr;
N.F = nullptr;
C.clear();
RC->clear();
RC->G = nullptr;

// Nothing to delete as all the objects are allocated in stable bump pointer
// allocators.
N.clear();
N.G = nullptr;
N.F = nullptr;
}
}

// Gets the Edge::Kind from one function to another by looking at the function's
Expand Down
29 changes: 14 additions & 15 deletions llvm/lib/Transforms/IPO/Inliner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -197,6 +197,14 @@ InlinerPass::getAdvisor(const ModuleAnalysisManagerCGSCCProxy::Result &MAM,
return *IAA->getAdvisor();
}

void makeFunctionBodyUnreachable(Function &F) {
F.dropAllReferences();
for (BasicBlock &BB : make_early_inc_range(F))
BB.eraseFromParent();
BasicBlock *BB = BasicBlock::Create(F.getContext(), "", &F);
new UnreachableInst(F.getContext(), BB);
}

PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
CGSCCAnalysisManager &AM, LazyCallGraph &CG,
CGSCCUpdateResult &UR) {
Expand Down Expand Up @@ -448,11 +456,9 @@ PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
}),
Calls.end());

// Clear the body and queue the function itself for deletion when we
// finish inlining and call graph updates.
// Note that after this point, it is an error to do anything other
// than use the callee's address or delete it.
Callee.dropAllReferences();
// Clear the body and queue the function itself for call graph
// updating when we finish inlining.
makeFunctionBodyUnreachable(Callee);
assert(!is_contained(DeadFunctions, &Callee) &&
"Cannot put cause a function to become dead twice!");
DeadFunctions.push_back(&Callee);
Expand Down Expand Up @@ -530,7 +536,7 @@ PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
if (!DeadFunctionsInComdats.empty()) {
filterDeadComdatFunctions(DeadFunctionsInComdats);
for (auto *Callee : DeadFunctionsInComdats)
Callee->dropAllReferences();
makeFunctionBodyUnreachable(*Callee);
DeadFunctions.append(DeadFunctionsInComdats);
}

Expand All @@ -542,25 +548,18 @@ PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
// that is OK as all we do is delete things and add pointers to unordered
// sets.
for (Function *DeadF : DeadFunctions) {
CG.markDeadFunction(*DeadF);
// Get the necessary information out of the call graph and nuke the
// function there. Also, clear out any cached analyses.
auto &DeadC = *CG.lookupSCC(*CG.lookup(*DeadF));
FAM.clear(*DeadF, DeadF->getName());
AM.clear(DeadC, DeadC.getName());
auto &DeadRC = DeadC.getOuterRefSCC();
CG.removeDeadFunction(*DeadF);

// Mark the relevant parts of the call graph as invalid so we don't visit
// them.
UR.InvalidatedSCCs.insert(&DeadC);
UR.InvalidatedRefSCCs.insert(&DeadRC);

// If the updated SCC was the one containing the deleted function, clear it.
if (&DeadC == UR.UpdatedC)
UR.UpdatedC = nullptr;

// And delete the actual function from the module.
M.getFunctionList().erase(DeadF);
UR.DeadFunctions.push_back(DeadF);

++NumDeleted;
}
Expand Down
12 changes: 8 additions & 4 deletions llvm/lib/Transforms/Utils/CallGraphUpdater.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -67,16 +67,20 @@ bool CallGraphUpdater::finalize() {

FAM.clear(*DeadFn, DeadFn->getName());
AM->clear(*DeadSCC, DeadSCC->getName());
LCG->removeDeadFunction(*DeadFn);
LCG->markDeadFunction(*DeadFn);

// Mark the relevant parts of the call graph as invalid so we don't
// visit them.
UR->InvalidatedSCCs.insert(DeadSCC);
UR->InvalidatedRefSCCs.insert(&DeadRC);
UR->DeadFunctions.push_back(DeadFn);
} else {
// The CGSCC infrastructure batch deletes functions at the end of the
// call graph walk, so only erase the function if we're not using that
// infrastructure.
// The function is now really dead and de-attached from everything.
DeadFn->eraseFromParent();
}

// The function is now really dead and de-attached from everything.
DeadFn->eraseFromParent();
}
}

Expand Down
2 changes: 0 additions & 2 deletions llvm/test/Other/cgscc-refscc-mutation-order.ll
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,6 @@
; CHECK-NOT: InstCombinePass
; CHECK: Running pass: InstCombinePass on f4
; CHECK-NOT: InstCombinePass
; CHECK: Running pass: InstCombinePass on f1
; CHECK-NOT: InstCombinePass

@a1 = alias void (), ptr @f1
@a2 = alias void (), ptr @f2
Expand Down
2 changes: 0 additions & 2 deletions llvm/test/Other/devirt-invalidated.ll
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
; RUN: opt -passes='devirt<0>(inline)' < %s -S | FileCheck %s

; CHECK-NOT: internal
; CHECK: define void @e()
; CHECK-NOT: internal

define void @e() {
entry:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,6 @@ define internal i32 @caller(ptr %B) {
; CGSCC-LABEL: define {{[^@]+}}@caller
; CGSCC-SAME: () #[[ATTR0]] {
; CGSCC-NEXT: [[A:%.*]] = alloca i32, align 4
; CGSCC-NEXT: [[A2:%.*]] = alloca i8, i32 0, align 4
; CGSCC-NEXT: [[A1:%.*]] = alloca i8, i32 0, align 4
; CGSCC-NEXT: ret i32 0
;
Expand Down
Loading
Loading