Skip to content

[ctxprof] Only prune the profile in modules containing only context trees #134340

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented Apr 4, 2025

We will subsequently treat the whole profile as "flat" in the frontend, (i.e flatten and combine with the flat profile section), so we can have a profile for ThinLTO for parts of the application that don't come under the contextual profile. After ThinLTO, we will treat the module(s) containing contextual trees differently: they'll have only the contextual profile pertinent to them. The rest of the modules (non-contextual) will proceed "as usual", off the flattened profile.

This patch implements pruning of the contextual profile to enable the above.

@mtrofin mtrofin force-pushed the users/mtrofin/04-03-_ctxprof_only_prune_the_profile_in_modules_containing_only_context_trees branch from c0b7038 to bbb3874 Compare April 4, 2025 03:04
@mtrofin mtrofin marked this pull request as ready for review April 4, 2025 03:08
@llvmbot llvmbot added llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Apr 4, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 4, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: Mircea Trofin (mtrofin)

Changes

We will subsequently treat the whole profile as "flat" in the frontend, so we can have a profile for ThinLTO for parts of the application that don't come under the contextual profile. After ThinLTO, we will treat the module(s) containing contextual trees differently: they'll have only the contextual profile pertinent to them. The rest of the modules (non-contextual) will proceed "as usual", off the flattened profile.

This patch implements pruning of the contextual profile to enable the above.


Full diff: https://github.com/llvm/llvm-project/pull/134340.diff

6 Files Affected:

  • (modified) llvm/include/llvm/Analysis/CtxProfAnalysis.h (+5)
  • (modified) llvm/lib/Analysis/CtxProfAnalysis.cpp (+36-18)
  • (modified) llvm/lib/Transforms/IPO/ElimAvailExtern.cpp (+1-1)
  • (modified) llvm/test/Analysis/CtxProfAnalysis/load.ll (+4)
  • (added) llvm/test/Analysis/CtxProfAnalysis/pruning.ll (+69)
  • (modified) llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll (+13-3)
diff --git a/llvm/include/llvm/Analysis/CtxProfAnalysis.h b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
index ede8bd2fe5001..f7f65458c16a9 100644
--- a/llvm/include/llvm/Analysis/CtxProfAnalysis.h
+++ b/llvm/include/llvm/Analysis/CtxProfAnalysis.h
@@ -33,6 +33,7 @@ class PGOContextualProfile {
     FunctionInfo(StringRef Name) : Name(Name) {}
   };
   PGOCtxProfile Profiles;
+  bool IsInSpecializedModule = false;
   // For the GUIDs in this module, associate metadata about each function which
   // we'll need when we maintain the profiles during IPO transformations.
   std::map<GlobalValue::GUID, FunctionInfo> FuncInfo;
@@ -56,6 +57,10 @@ class PGOContextualProfile {
 
   const PGOCtxProfile &profiles() const { return Profiles; }
 
+  // True if this module is a post-thinlto module containing just functions
+  // participating in one or more contextual profiles.
+  bool isInSpecializedModule() const { return IsInSpecializedModule; }
+
   bool isFunctionKnown(const Function &F) const {
     return getDefinedFunctionGUID(F) != 0;
   }
diff --git a/llvm/lib/Analysis/CtxProfAnalysis.cpp b/llvm/lib/Analysis/CtxProfAnalysis.cpp
index e021e2a801006..3ae333b09d0ce 100644
--- a/llvm/lib/Analysis/CtxProfAnalysis.cpp
+++ b/llvm/lib/Analysis/CtxProfAnalysis.cpp
@@ -12,6 +12,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Analysis/CtxProfAnalysis.h"
+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/IR/Analysis.h"
 #include "llvm/IR/IntrinsicInst.h"
@@ -20,6 +21,7 @@
 #include "llvm/ProfileData/PGOCtxProfReader.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/MemoryBuffer.h"
+#include "llvm/Support/Path.h"
 
 #define DEBUG_TYPE "ctx_prof"
 
@@ -95,26 +97,42 @@ PGOContextualProfile CtxProfAnalysis::run(Module &M,
     return {};
   }
 
-  DenseSet<GlobalValue::GUID> ProfileRootsInModule;
-  for (const auto &F : M)
-    if (!F.isDeclaration())
-      if (auto GUID = AssignGUIDPass::getGUID(F);
-          MaybeProfiles->Contexts.find(GUID) != MaybeProfiles->Contexts.end())
-        ProfileRootsInModule.insert(GUID);
-
-  // Trim first the roots that aren't in this module.
-  for (auto &[RootGuid, _] :
-       llvm::make_early_inc_range(MaybeProfiles->Contexts))
-    if (!ProfileRootsInModule.contains(RootGuid))
-      MaybeProfiles->Contexts.erase(RootGuid);
-  // If none of the roots are in the module, we have no profile (for this
-  // module)
-  if (MaybeProfiles->Contexts.empty())
-    return {};
-
-  // OK, so we have a valid profile and it's applicable to roots in this module.
+  // FIXME: We should drive this from ThinLTO, but for the time being, use the
+  // module name as indicator.
+  // We want to *only* keep the contextual profiles in modules that capture
+  // context trees. That allows us to compute specific PSIs, for example.
+  auto DetermineRootsInModule = [&M]() -> const DenseSet<GlobalValue::GUID> {
+    DenseSet<GlobalValue::GUID> ProfileRootsInModule;
+    auto ModName = M.getName();
+    auto Filename = sys::path::filename(ModName);
+    // Drop the file extension.
+    Filename = Filename.substr(0, Filename.find_last_of('.'));
+    // See if it parses
+    APInt Guid;
+    // getAsInteger returns true if there are more chars to read other than the
+    // integer. So the "false" test is what we want.
+    if (!Filename.getAsInteger(0, Guid))
+      ProfileRootsInModule.insert(Guid.getZExtValue());
+    return ProfileRootsInModule;
+  };
+  const auto ProfileRootsInModule = DetermineRootsInModule();
   PGOContextualProfile Result;
 
+  // the logic from here on allows for modules that contain - by design - more
+  // than one root. We currently don't support that, because the determination
+  // happens based on the module name matching the root guid, but the logic can
+  // avoid assuming that.
+  if (!ProfileRootsInModule.empty()) {
+    Result.IsInSpecializedModule = true;
+    // Trim first the roots that aren't in this module.
+    for (auto &[RootGuid, _] :
+         llvm::make_early_inc_range(MaybeProfiles->Contexts))
+      if (!ProfileRootsInModule.contains(RootGuid))
+        MaybeProfiles->Contexts.erase(RootGuid);
+    // we can also drop the flat profiles
+    MaybeProfiles->FlatProfiles.clear();
+  }
+
   for (const auto &F : M) {
     if (F.isDeclaration())
       continue;
diff --git a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
index de11f7f6b123d..718452fc02764 100644
--- a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
+++ b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
@@ -134,7 +134,7 @@ EliminateAvailableExternallyPass::run(Module &M, ModuleAnalysisManager &MAM) {
   // for this contextual information. Eliding it in favor of the original would
   // undo these optimizations.
   if (!eliminateAvailableExternally(
-          M, /*Convert=*/(CtxProf && !CtxProf->contexts().empty())))
+          M, /*Convert=*/(CtxProf && CtxProf->isInSpecializedModule())))
     return PreservedAnalyses::all();
   return PreservedAnalyses::none();
 }
diff --git a/llvm/test/Analysis/CtxProfAnalysis/load.ll b/llvm/test/Analysis/CtxProfAnalysis/load.ll
index 6091a99ed3680..bd21a4b710630 100644
--- a/llvm/test/Analysis/CtxProfAnalysis/load.ll
+++ b/llvm/test/Analysis/CtxProfAnalysis/load.ll
@@ -46,6 +46,9 @@ Function Info:
 Current Profile:
 
 Contexts:
+  - Guid:            12341
+    TotalRootEntryCount: 90
+    Counters:        [ 9 ]
   - Guid:            11872291593386833696
     TotalRootEntryCount: 4
     Counters:        [ 1 ]
@@ -57,6 +60,7 @@ Contexts:
     Counters:        [ 5 ]
 
 Flat Profile:
+12341 : 9 
 728453322856651412 : 6 7 
 11872291593386833696 : 1 
 12074870348631550642 : 5 
diff --git a/llvm/test/Analysis/CtxProfAnalysis/pruning.ll b/llvm/test/Analysis/CtxProfAnalysis/pruning.ll
new file mode 100644
index 0000000000000..65fa328ef820b
--- /dev/null
+++ b/llvm/test/Analysis/CtxProfAnalysis/pruning.ll
@@ -0,0 +1,69 @@
+; REQUIRES: x86_64-linux
+;
+; Check that we don't prune the contextual profile, unless the module name
+; matches the guid of the root.
+;
+; RUN: rm -rf %t
+; RUN: split-file %s %t
+; RUN: llvm-ctxprof-util fromYAML --input=%t/profile.yaml --output=%t/profile.ctxprofdata
+;
+; RUN: cp %t/example.ll %t/1234.ll
+; RUN: cp %t/example.ll %t/0x4d2.ll
+;
+; RUN: opt -passes='require<ctx-prof-analysis>,print<ctx-prof-analysis>' \
+; RUN:   -use-ctx-profile=%t/profile.ctxprofdata \
+; RUN:   -ctx-profile-printer-level=everything \
+; RUN:   %t/example.ll -S 2>&1 | FileCheck %s
+
+; RUN: opt -passes='require<ctx-prof-analysis>,print<ctx-prof-analysis>' \
+; RUN:   -use-ctx-profile=%t/profile.ctxprofdata \
+; RUN:   -ctx-profile-printer-level=everything \
+; RUN:   %t/not-matching.ll -S 2>&1 | FileCheck %s
+
+; RUN: opt -passes='require<ctx-prof-analysis>,print<ctx-prof-analysis>' \
+; RUN:   -use-ctx-profile=%t/profile.ctxprofdata \
+; RUN:   -ctx-profile-printer-level=everything \
+; RUN:   %t/0x4d2.ll -S 2>&1 | FileCheck %s --check-prefix=PRUNED
+
+; CHECK:    Contexts:
+; CHECK:      - Guid:            1234
+; CHECK:      - Guid:            5678
+; CHECK:    FlatProfiles:
+; PRUNED-NOT: - Guid:            5678
+; PRUNED-NOT: FlatProfiles
+;
+; pick a large GUID that would be negative, if signed, to test a few ways the
+; file name may be formatted.
+;--- profile.yaml
+Contexts:
+  - Guid: 1234
+    TotalRootEntryCount: 24
+    Counters: [9]
+    Callsites:  -
+                  - Guid: 1000
+                    Counters: [6, 7]
+
+  - Guid: 5678
+    TotalRootEntryCount: 24
+    Counters: [9]
+    Callsites:  -
+                  - Guid: 1000
+                    Counters: [6, 7]
+FlatProfiles:
+  - Guid: 777
+    Counters: [2]                    
+;--- example.ll
+define void @an_entrypoint(i32 %a) !guid !0 {
+  ret void
+}
+
+attributes #0 = { noinline }
+!0 = !{ i64  1234 }
+
+;--- not-matching.ll
+define void @an_entrypoint(i32 %a) !guid !0 {
+  ret void
+}
+
+attributes #0 = { noinline }
+!0 = !{ i64  1000 }
diff --git a/llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll b/llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll
index b6465f44faa18..c24e4925fe78f 100644
--- a/llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll
+++ b/llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll
@@ -1,16 +1,26 @@
 ; REQUIRES: asserts
-; RUN: opt -passes=elim-avail-extern -avail-extern-to-local -stats -S 2>&1 < %s | FileCheck %s
+; RUN: rm -rf %t
+; RUN: mkdir %t
+; RUN: cp %s %t/1234.ll
+;
+; default behavior
+; RUN: opt -passes=elim-avail-extern -stats -S 2>&1 %s | FileCheck %s --check-prefix=NOOP
+;
+; check the -avail-extern-to-local flag works as intended
+; RUN: opt -passes=elim-avail-extern -avail-extern-to-local -stats -S 2>&1 %t/1234.ll | FileCheck %s
+; RUN: opt -passes=elim-avail-extern -avail-extern-to-local -stats -S 2>&1 %s | FileCheck %s
 ;
 ; RUN: echo '{"Contexts": [{"Guid":1234, "TotalRootEntryCount": 5, "Counters": [1]}]}' | llvm-ctxprof-util fromYAML --input=- --output=%t_profile.ctxprofdata
 ;
 ; Because we pass a contextual profile with a root defined in this module, we expect the outcome to be the same as-if
 ; we passed -avail-extern-to-local, i.e. available_externally don't get elided and instead get converted to local linkage
-; RUN: opt -passes='assign-guid,require<ctx-prof-analysis>,elim-avail-extern' -use-ctx-profile=%t_profile.ctxprofdata -stats -S 2>&1 < %s | FileCheck %s
+;
+; RUN: opt -passes='assign-guid,require<ctx-prof-analysis>,elim-avail-extern' -use-ctx-profile=%t_profile.ctxprofdata -stats -S 2>&1 %t/1234.ll | FileCheck %s
 
 ; If the profile doesn't apply to this module, available_externally won't get converted to internal linkage, and will be
 ; removed instead.
 ; RUN: echo '{"Contexts": [{"Guid":5678, "TotalRootEntryCount": 3, "Counters": [1]}]}' | llvm-ctxprof-util fromYAML --input=- --output=%t_profile_bad.ctxprofdata
-; RUN: opt -passes='assign-guid,require<ctx-prof-analysis>,elim-avail-extern' -use-ctx-profile=%t_profile_bad.ctxprofdata -stats -S 2>&1 < %s | FileCheck %s --check-prefix=NOOP
+; RUN: opt -passes='assign-guid,require<ctx-prof-analysis>,elim-avail-extern' -use-ctx-profile=%t_profile_bad.ctxprofdata -stats -S %s 2>&1 | FileCheck %s --check-prefix=NOOP
 
 declare void @call_out(ptr %fct)
 

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines 60 to 61
// True if this module is a post-thinlto module containing just functions
// participating in one or more contextual profiles.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the other comments are on the members instead of the accessor. Lets move this and make it consistent?

@@ -0,0 +1,69 @@
; REQUIRES: x86_64-linux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need for it to be x86_64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of - the runtime support is only for x86_64 and linux, so would rather not give the impression any of this would work elsewhere at this point.

@mtrofin mtrofin force-pushed the users/mtrofin/04-03-_ctxprof_only_prune_the_profile_in_modules_containing_only_context_trees branch from bbb3874 to 00357fe Compare April 8, 2025 00:31
Copy link
Member Author

mtrofin commented Apr 8, 2025

Merge activity

  • Apr 7, 10:45 PM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Apr 7, 10:47 PM EDT: Graphite rebased this pull request as part of a merge.
  • Apr 7, 10:50 PM EDT: Graphite rebased this pull request as part of a merge.
  • Apr 7, 10:52 PM EDT: A user merged this pull request with Graphite.

@mtrofin mtrofin force-pushed the users/mtrofin/04-03-_ctxprof_only_prune_the_profile_in_modules_containing_only_context_trees branch from 00357fe to 0615ca5 Compare April 8, 2025 02:47
@mtrofin mtrofin force-pushed the users/mtrofin/04-03-_ctxprof_only_prune_the_profile_in_modules_containing_only_context_trees branch from 0615ca5 to c61e22f Compare April 8, 2025 02:49
@mtrofin mtrofin merged commit 6a3e5f8 into main Apr 8, 2025
6 of 10 checks passed
@mtrofin mtrofin deleted the users/mtrofin/04-03-_ctxprof_only_prune_the_profile_in_modules_containing_only_context_trees branch April 8, 2025 02:52
mtrofin added a commit that referenced this pull request Apr 8, 2025
…l profile (#134468)

After #134340, the availability of contextual profile isn't in itself an indication of compiling the module containing all the functions covered by that profile.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants