-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AlwaysInline] Avoid unnecessary BFI fetches #117750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Nikita Popov (nikic) ChangesAlwaysInliner doesn't use BFI itself, it only updates it. If BFI is not already computed, it will spend time to first compute it, and then update it. This is not necessary: If BFI is not available in the first place, there is no need to update it. This is mainly relevant in debug builds for IR that has a lot of alwaysinline functions. Full diff: https://github.com/llvm/llvm-project/pull/117750.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/AlwaysInliner.cpp b/llvm/lib/Transforms/IPO/AlwaysInliner.cpp
index 1f787c733079e9..447a0f492775da 100644
--- a/llvm/lib/Transforms/IPO/AlwaysInliner.cpp
+++ b/llvm/lib/Transforms/IPO/AlwaysInliner.cpp
@@ -34,7 +34,8 @@ bool AlwaysInlineImpl(
Module &M, bool InsertLifetime, ProfileSummaryInfo &PSI,
function_ref<AssumptionCache &(Function &)> GetAssumptionCache,
function_ref<AAResults &(Function &)> GetAAR,
- function_ref<BlockFrequencyInfo &(Function &)> GetBFI) {
+ function_ref<BlockFrequencyInfo &(Function &)> GetBFI,
+ function_ref<BlockFrequencyInfo *(Function &)> GetCachedBFI) {
SmallSetVector<CallBase *, 16> Calls;
bool Changed = false;
SmallVector<Function *, 16> InlinedComdatFunctions;
@@ -61,9 +62,11 @@ bool AlwaysInlineImpl(
DebugLoc DLoc = CB->getDebugLoc();
BasicBlock *Block = CB->getParent();
- InlineFunctionInfo IFI(GetAssumptionCache, &PSI,
- GetBFI ? &GetBFI(*Caller) : nullptr,
- GetBFI ? &GetBFI(F) : nullptr);
+ // Only update CallerBFI if already available. The CallerBFI update
+ // requires CalleeBFI.
+ BlockFrequencyInfo *CallerBFI = GetCachedBFI(*Caller);
+ InlineFunctionInfo IFI(GetAssumptionCache, &PSI, CallerBFI,
+ CallerBFI ? &GetBFI(F) : nullptr);
InlineResult Res = InlineFunction(*CB, IFI, /*MergeAttributes=*/true,
&GetAAR(F), InsertLifetime);
@@ -133,9 +136,12 @@ struct AlwaysInlinerLegacyPass : public ModulePass {
auto GetAssumptionCache = [&](Function &F) -> AssumptionCache & {
return getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
};
+ auto GetCachedBFI = [&](Function &) -> BlockFrequencyInfo * {
+ return nullptr;
+ };
return AlwaysInlineImpl(M, InsertLifetime, PSI, GetAssumptionCache, GetAAR,
- /*GetBFI*/ nullptr);
+ /*GetBFI=*/nullptr, GetCachedBFI);
}
static char ID; // Pass identification, replacement for typeid
@@ -172,13 +178,16 @@ PreservedAnalyses AlwaysInlinerPass::run(Module &M,
auto GetBFI = [&](Function &F) -> BlockFrequencyInfo & {
return FAM.getResult<BlockFrequencyAnalysis>(F);
};
+ auto GetCachedBFI = [&](Function &F) -> BlockFrequencyInfo * {
+ return FAM.getCachedResult<BlockFrequencyAnalysis>(F);
+ };
auto GetAAR = [&](Function &F) -> AAResults & {
return FAM.getResult<AAManager>(F);
};
auto &PSI = MAM.getResult<ProfileSummaryAnalysis>(M);
bool Changed = AlwaysInlineImpl(M, InsertLifetime, PSI, GetAssumptionCache,
- GetAAR, GetBFI);
+ GetAAR, GetBFI, GetCachedBFI);
return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Co-authored-by: Florian Hahn <[email protected]>
Hi, I noticed that the following starts crashing with this patch:
|
Thanks for the report. Also reproduces with The problem here is that we can't really fetch analyses during inlining at all, because AlwaysInliner doesn't invalidate existing analyses incrementally (i.e. for each modified function). But then, now that I look closed, if we actually inline anything, then we're not preserving any analysis at all, so there is no need to update BFI in the first place. |
This is a followup to llvm#117750. Currently, AlwaysInline only invalidates analyses at the end, by returning that no analyses are preserved. However, this means that analyses fetched during inlining may be outdated. The aforementioned PR exposed this issue. Instead, bring the logic closer to what the normal inliner does, by directly invalidating the caller in FAM. This should make sure that we don't receive any outdated analyses even if they are fetched during inlining. Also drop the BFI updating entirely -- there's no point in doing it if we're going to invalidate everything anyway.
Candidate fix: #119566 |
This is a followup to llvm#117750. Currently, AlwaysInline only invalidates analyses at the end, by returning that no analyses are preserved. However, this means that analyses fetched during inlining may be outdated. The aforementioned PR exposed this issue. Instead, bring the logic closer to what the normal inliner does, by directly invalidating the caller in FAM. This should make sure that we don't receive any outdated analyses even if they are fetched during inlining. Also drop the BFI updating entirely -- there's no point in doing it if we're going to invalidate everything anyway.
This is a followup to #117750. Currently, AlwaysInline only invalidates analyses at the end, by returning that no analyses are preserved. However, this means that analyses fetched during inlining may be outdated. The aforementioned PR exposed this issue. Instead, bring the logic closer to what the normal inliner does, by directly invalidating the caller in FAM. This should make sure that we don't receive any outdated analyses even if they are fetched during inlining. Also drop the BFI updating entirely -- there's no point in doing it if we're going to invalidate everything anyway.
Thank you! I've verifed that the fix you pushed solves the original problem I saw. |
AlwaysInliner doesn't use BFI itself, it only updates it. If BFI is not already computed, it will spend time to first compute it, and then update it. This is not necessary: If BFI is not available in the first place, there is no need to update it.
This is mainly relevant in debug builds for IR that has a lot of alwaysinline functions.