Skip to content

InlineFunction: Split inlining into predicate and apply functions #134213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Apr 3, 2025

This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying compatible
getGC and personalities. Also collection of EHPad and the convergence token
to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so better document how this system is supposed to work.
It turns out this split does make sense, in that isInlineViable checks
if it's possible based on the callee content and the ultimate inline
depended on the callsite context. I think more renames of these functions
would help, and isInlineViable should probably move out of InlineCost to be
with these transfoms.

Copy link
Contributor Author

arsenm commented Apr 3, 2025

@arsenm arsenm added ipo Interprocedural optimizations llvm:transforms labels Apr 3, 2025 — with Graphite App
@arsenm arsenm marked this pull request as ready for review April 3, 2025 07:25
@llvmbot
Copy link
Member

llvmbot commented Apr 3, 2025

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Matt Arsenault (arsenm)

Changes

This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying compatible
getGC and personalities. Also collection of EHPad and the convergence token
to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so better document how this system is supposed to work.
It turns out this split does make sense, in that isInlineViable checks
if it's possible based on the callee content and the ultimate inline
depended on the callsite context. I think more renames of these functions
would help, and isInlineViable should probably move out of InlineCost to be
with these transfoms.


Full diff: https://github.com/llvm/llvm-project/pull/134213.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/InlineCost.h (+5-1)
  • (modified) llvm/include/llvm/Transforms/Utils/Cloning.h (+28)
  • (modified) llvm/lib/Transforms/Utils/InlineFunction.cpp (+81-42)
diff --git a/llvm/include/llvm/Analysis/InlineCost.h b/llvm/include/llvm/Analysis/InlineCost.h
index 90ee75773957a..ec59d54954e16 100644
--- a/llvm/include/llvm/Analysis/InlineCost.h
+++ b/llvm/include/llvm/Analysis/InlineCost.h
@@ -334,7 +334,11 @@ std::optional<InlineCostFeatures> getInliningCostFeatures(
     ProfileSummaryInfo *PSI = nullptr,
     OptimizationRemarkEmitter *ORE = nullptr);
 
-/// Minimal filter to detect invalid constructs for inlining.
+/// Check if it is mechanically possible to inline the function \p Callee, based
+/// on the contents of the function.
+///
+/// See also \p CanInlineCallSite as an additional precondition necessary to
+/// perform a valid inline in a particular use context.
 InlineResult isInlineViable(Function &Callee);
 
 // This pass is used to annotate instructions during the inline process for
diff --git a/llvm/include/llvm/Transforms/Utils/Cloning.h b/llvm/include/llvm/Transforms/Utils/Cloning.h
index ec1a1d5faa7e9..201e6ba2b491f 100644
--- a/llvm/include/llvm/Transforms/Utils/Cloning.h
+++ b/llvm/include/llvm/Transforms/Utils/Cloning.h
@@ -263,6 +263,9 @@ class InlineFunctionInfo {
   /// `InlinedCalls` above is used.
   SmallVector<CallBase *, 8> InlinedCallSites;
 
+  Value *ConvergenceControlToken = nullptr;
+  Instruction *CallSiteEHPad = nullptr;
+
   /// Update profile for callee as well as cloned version. We need to do this
   /// for regular inlining, but not for inlining from sample profile loader.
   bool UpdateProfile;
@@ -271,9 +274,34 @@ class InlineFunctionInfo {
     StaticAllocas.clear();
     InlinedCalls.clear();
     InlinedCallSites.clear();
+    ConvergenceControlToken = nullptr;
+    CallSiteEHPad = nullptr;
   }
 };
 
+/// Check if it is legal to perform inlining of the function called by \p CB
+/// into the caller at this particular use, and sets fields in \p IFI.
+///
+/// This does not consider whether it is possible for the function callee itself
+/// to be inlined; for that see isInlineViable.
+InlineResult CanInlineCallSite(const CallBase &CB, InlineFunctionInfo &IFI);
+
+/// This should generally not be used, use InlineFunction instead.
+///
+/// Perform mechanical inlining of \p CB into the caller.
+///
+/// This does not perform any legality or profitability checks for the
+/// inlining. This assumes that CanInlineCallSite was already called, populated
+/// \p IFI, and returned InlineResult::success.
+///
+/// Also assumes that isInlineViable returned InlineResult::success for the
+/// called function.
+void InlineFunctionImpl(CallBase &CB, InlineFunctionInfo &IFI,
+                        bool MergeAttributes = false,
+                        AAResults *CalleeAAR = nullptr,
+                        bool InsertLifetime = true,
+                        Function *ForwardVarArgsTo = nullptr);
+
 /// This function inlines the called function into the basic
 /// block of the caller.  This returns false if it is not possible to inline
 /// this call.  The program is still in a well defined state if this occurs
diff --git a/llvm/lib/Transforms/Utils/InlineFunction.cpp b/llvm/lib/Transforms/Utils/InlineFunction.cpp
index 131fbe654c11c..7236cc0131eb9 100644
--- a/llvm/lib/Transforms/Utils/InlineFunction.cpp
+++ b/llvm/lib/Transforms/Utils/InlineFunction.cpp
@@ -2446,19 +2446,8 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
   return Ret;
 }
 
-/// This function inlines the called function into the basic block of the
-/// caller. This returns false if it is not possible to inline this call.
-/// The program is still in a well defined state if this occurs though.
-///
-/// Note that this only does one level of inlining.  For example, if the
-/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now
-/// exists in the instruction stream.  Similarly this will inline a recursive
-/// function by one level.
-llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
-                                        bool MergeAttributes,
-                                        AAResults *CalleeAAR,
-                                        bool InsertLifetime,
-                                        Function *ForwardVarArgsTo) {
+llvm::InlineResult llvm::CanInlineCallSite(const CallBase &CB,
+                                           InlineFunctionInfo &IFI) {
   assert(CB.getParent() && CB.getFunction() && "Instruction not in function!");
 
   // FIXME: we don't inline callbr yet.
@@ -2475,7 +2464,6 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
 
   // The inliner does not know how to inline through calls with operand bundles
   // in general ...
-  Value *ConvergenceControlToken = nullptr;
   if (CB.hasOperandBundles()) {
     for (int i = 0, e = CB.getNumOperandBundles(); i != e; ++i) {
       auto OBUse = CB.getOperandBundleAt(i);
@@ -2491,7 +2479,7 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
       if (Tag == LLVMContext::OB_kcfi)
         continue;
       if (Tag == LLVMContext::OB_convergencectrl) {
-        ConvergenceControlToken = OBUse.Inputs[0].get();
+        IFI.ConvergenceControlToken = OBUse.Inputs[0].get();
         continue;
       }
 
@@ -2509,28 +2497,22 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
   // fully implements convergence control tokens, there is no mixing of
   // controlled and uncontrolled convergent operations in the whole program.
   if (CB.isConvergent()) {
-    if (!ConvergenceControlToken &&
+    if (!IFI.ConvergenceControlToken &&
         getConvergenceEntry(CalledFunc->getEntryBlock())) {
       return InlineResult::failure(
           "convergent call needs convergencectrl operand");
     }
   }
 
-  // If the call to the callee cannot throw, set the 'nounwind' flag on any
-  // calls that we inline.
-  bool MarkNoUnwind = CB.doesNotThrow();
-
-  BasicBlock *OrigBB = CB.getParent();
-  Function *Caller = OrigBB->getParent();
+  const BasicBlock *OrigBB = CB.getParent();
+  const Function *Caller = OrigBB->getParent();
 
   // GC poses two hazards to inlining, which only occur when the callee has GC:
   //  1. If the caller has no GC, then the callee's GC must be propagated to the
   //     caller.
   //  2. If the caller has a differing GC, it is invalid to inline.
   if (CalledFunc->hasGC()) {
-    if (!Caller->hasGC())
-      Caller->setGC(CalledFunc->getGC());
-    else if (CalledFunc->getGC() != Caller->getGC())
+    if (Caller->hasGC() && CalledFunc->getGC() != Caller->getGC())
       return InlineResult::failure("incompatible GC");
   }
 
@@ -2548,34 +2530,31 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
           ? Caller->getPersonalityFn()->stripPointerCasts()
           : nullptr;
   if (CalledPersonality) {
-    if (!CallerPersonality)
-      Caller->setPersonalityFn(CalledPersonality);
     // If the personality functions match, then we can perform the
     // inlining. Otherwise, we can't inline.
     // TODO: This isn't 100% true. Some personality functions are proper
     //       supersets of others and can be used in place of the other.
-    else if (CalledPersonality != CallerPersonality)
+    if (CallerPersonality && CalledPersonality != CallerPersonality)
       return InlineResult::failure("incompatible personality");
   }
 
   // We need to figure out which funclet the callsite was in so that we may
   // properly nest the callee.
-  Instruction *CallSiteEHPad = nullptr;
   if (CallerPersonality) {
     EHPersonality Personality = classifyEHPersonality(CallerPersonality);
     if (isScopedEHPersonality(Personality)) {
       std::optional<OperandBundleUse> ParentFunclet =
           CB.getOperandBundle(LLVMContext::OB_funclet);
       if (ParentFunclet)
-        CallSiteEHPad = cast<FuncletPadInst>(ParentFunclet->Inputs.front());
+        IFI.CallSiteEHPad = cast<FuncletPadInst>(ParentFunclet->Inputs.front());
 
       // OK, the inlining site is legal.  What about the target function?
 
-      if (CallSiteEHPad) {
+      if (IFI.CallSiteEHPad) {
         if (Personality == EHPersonality::MSVC_CXX) {
           // The MSVC personality cannot tolerate catches getting inlined into
           // cleanup funclets.
-          if (isa<CleanupPadInst>(CallSiteEHPad)) {
+          if (isa<CleanupPadInst>(IFI.CallSiteEHPad)) {
             // Ok, the call site is within a cleanuppad.  Let's check the callee
             // for catchpads.
             for (const BasicBlock &CalledBB : *CalledFunc) {
@@ -2595,13 +2574,33 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
     }
   }
 
+  return InlineResult::success();
+}
+
+/// This function inlines the called function into the basic block of the
+/// caller. This returns false if it is not possible to inline this call.
+/// The program is still in a well defined state if this occurs though.
+///
+/// Note that this only does one level of inlining.  For example, if the
+/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now
+/// exists in the instruction stream.  Similarly this will inline a recursive
+/// function by one level.
+void llvm::InlineFunctionImpl(CallBase &CB, InlineFunctionInfo &IFI,
+                              bool MergeAttributes, AAResults *CalleeAAR,
+                              bool InsertLifetime, Function *ForwardVarArgsTo) {
+  BasicBlock *OrigBB = CB.getParent();
+  Function *Caller = OrigBB->getParent();
+  Function *CalledFunc = CB.getCalledFunction();
+  assert(CalledFunc && !CalledFunc->isDeclaration() &&
+         "CanInlineCallSite should have verified direct call to definition");
+
   // Determine if we are dealing with a call in an EHPad which does not unwind
   // to caller.
   bool EHPadForCallUnwindsLocally = false;
-  if (CallSiteEHPad && isa<CallInst>(CB)) {
+  if (IFI.CallSiteEHPad && isa<CallInst>(CB)) {
     UnwindDestMemoTy FuncletUnwindMap;
     Value *CallSiteUnwindDestToken =
-        getUnwindDestToken(CallSiteEHPad, FuncletUnwindMap);
+        getUnwindDestToken(IFI.CallSiteEHPad, FuncletUnwindMap);
 
     EHPadForCallUnwindsLocally =
         CallSiteUnwindDestToken &&
@@ -2618,6 +2617,30 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
   ClonedCodeInfo InlinedFunctionInfo;
   Function::iterator FirstNewBlock;
 
+  // GC poses two hazards to inlining, which only occur when the callee has GC:
+  //  1. If the caller has no GC, then the callee's GC must be propagated to the
+  //     caller.
+  //  2. If the caller has a differing GC, it is invalid to inline.
+  if (CalledFunc->hasGC()) {
+    if (!Caller->hasGC())
+      Caller->setGC(CalledFunc->getGC());
+    else {
+      assert(CalledFunc->getGC() == Caller->getGC() &&
+             "CanInlineCallSite should have verified compatible GCs");
+    }
+  }
+
+  if (CalledFunc->hasPersonalityFn()) {
+    Constant *CalledPersonality =
+        CalledFunc->getPersonalityFn()->stripPointerCasts();
+    if (!Caller->hasPersonalityFn()) {
+      Caller->setPersonalityFn(CalledPersonality);
+    } else
+      assert(Caller->getPersonalityFn()->stripPointerCasts() ==
+                 CalledPersonality &&
+             "CanInlineCallSite should have verified compatible personality");
+  }
+
   { // Scope to destroy VMap after cloning.
     ValueToValueMapTy VMap;
     struct ByValInit {
@@ -2802,10 +2825,10 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
             IFI.GetAssumptionCache(*Caller).registerAssumption(II);
   }
 
-  if (ConvergenceControlToken) {
+  if (IFI.ConvergenceControlToken) {
     IntrinsicInst *IntrinsicCall = getConvergenceEntry(*FirstNewBlock);
     if (IntrinsicCall) {
-      IntrinsicCall->replaceAllUsesWith(ConvergenceControlToken);
+      IntrinsicCall->replaceAllUsesWith(IFI.ConvergenceControlToken);
       IntrinsicCall->eraseFromParent();
     }
   }
@@ -2852,6 +2875,10 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
     }
   }
 
+  // If the call to the callee cannot throw, set the 'nounwind' flag on any
+  // calls that we inline.
+  bool MarkNoUnwind = CB.doesNotThrow();
+
   SmallVector<Value*,4> VarArgsToForward;
   SmallVector<AttributeSet, 4> VarArgsAttrs;
   for (unsigned i = CalledFunc->getFunctionType()->getNumParams();
@@ -3038,12 +3065,12 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
   // Update the lexical scopes of the new funclets and callsites.
   // Anything that had 'none' as its parent is now nested inside the callsite's
   // EHPad.
-  if (CallSiteEHPad) {
+  if (IFI.CallSiteEHPad) {
     for (Function::iterator BB = FirstNewBlock->getIterator(),
                             E = Caller->end();
          BB != E; ++BB) {
       // Add bundle operands to inlined call sites.
-      PropagateOperandBundles(BB, CallSiteEHPad);
+      PropagateOperandBundles(BB, IFI.CallSiteEHPad);
 
       // It is problematic if the inlinee has a cleanupret which unwinds to
       // caller and we inline it into a call site which doesn't unwind but into
@@ -3059,11 +3086,11 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
 
       if (auto *CatchSwitch = dyn_cast<CatchSwitchInst>(I)) {
         if (isa<ConstantTokenNone>(CatchSwitch->getParentPad()))
-          CatchSwitch->setParentPad(CallSiteEHPad);
+          CatchSwitch->setParentPad(IFI.CallSiteEHPad);
       } else {
         auto *FPI = cast<FuncletPadInst>(I);
         if (isa<ConstantTokenNone>(FPI->getParentPad()))
-          FPI->setParentPad(CallSiteEHPad);
+          FPI->setParentPad(IFI.CallSiteEHPad);
       }
     }
   }
@@ -3219,7 +3246,7 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
       AttributeFuncs::mergeAttributesForInlining(*Caller, *CalledFunc);
 
     // We are now done with the inlining.
-    return InlineResult::success();
+    return;
   }
 
   // Otherwise, we have the normal case, of more than one block to inline or
@@ -3379,6 +3406,18 @@ llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
 
   if (MergeAttributes)
     AttributeFuncs::mergeAttributesForInlining(*Caller, *CalledFunc);
+}
 
-  return InlineResult::success();
+llvm::InlineResult llvm::InlineFunction(CallBase &CB, InlineFunctionInfo &IFI,
+                                        bool MergeAttributes,
+                                        AAResults *CalleeAAR,
+                                        bool InsertLifetime,
+                                        Function *ForwardVarArgsTo) {
+  llvm::InlineResult Result = CanInlineCallSite(CB, IFI);
+  if (Result.isSuccess()) {
+    InlineFunctionImpl(CB, IFI, MergeAttributes, CalleeAAR, InsertLifetime,
+                       ForwardVarArgsTo);
+  }
+
+  return Result;
 }

@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Apr 3, 2025
@regehr
Copy link
Contributor

regehr commented Apr 3, 2025

LGTM

Base automatically changed from users/arsenm/clone-function-fix-folding-blocks-with-address-taken to main April 3, 2025 16:52
@arsenm arsenm force-pushed the users/arsenm/inline-function/split-legality-predicate-and-impl branch from 717cc01 to 61ae6ef Compare April 3, 2025 16:54
Comment on lines +297 to +298
/// Also assumes that isInlineViable returned InlineResult::success for the
/// called function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps worth an assert then?

/// on the contents of the function.
///
/// See also \p CanInlineCallSite as an additional precondition necessary to
/// perform a valid inline in a particular use context.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd be curious to know more about why someone would call isInlineViable vs CanInlineCallsite? The names alone sound like synonyms to me...

Is one doing checks the other should be doing, or vice versa? If so, perhaps they should just be one function...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should not be one function, and I think I think the split makes sense, but the names need work. A wrapper function which calls the two would be useful, but they should be available as the split parts.

isInlineViable is a restriction on ever inlining the function based on the callee's properties. As in, the inliner cannot physically inline the function. You can pre-filter handling functions which will never be inlined.

CanInlineCallsite is for context sensitive restrictions at the specific callsite. It's a property of the caller, more than the callee

This is to support a new inline function reduction in llvm-reduce,
which should pre-filter callsites that are not eligible for inlining.

This code was mostly structured as a match and apply, with a few
exceptions. The ugliest piece is for propagating and verifying compatible
getGC and personalities. Also collection of EHPad and the convergence token
to use are now cached in InlineFunctionInfo.

I was initially confused by the split between the checks performed here
and isInlineViable, so better document how this system is supposed to work.
It turns out this split does make sense, in that isInlineViable checks
if it's possible based on the callee content and the ultimate inline
depended on the callsite context. I think more renames of these functions
would help, and isInlineViable should probably move out of InlineCost to be
with these transfoms.
@arsenm arsenm force-pushed the users/arsenm/inline-function/split-legality-predicate-and-impl branch from 61ae6ef to b3f6884 Compare May 2, 2025 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ipo Interprocedural optimizations llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants