Skip to content

[CodeGen][StaticDataPartitioning]Place local-linkage global variables in hot or unlikely prefixed sections based on profile information #125756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Mar 28, 2025

Conversation

mingmingl-llvm
Copy link
Contributor

In this PR, static-data-splitter pass finds out the local-linkage global variables in {.rodata, .data.rel.ro, bss, .data} sections by analyzing machine instruction operands, and aggregates their accesses from code across functions.

A follow-up item is to analyze global variable initializers and count for access from data.

  • This limitation is demonstrated by bss2 and data3 in llvm/test/CodeGen/X86/global-variable-partition.ll.

Some stats of static-data-splitter with this patch:

section bss rodata data
hot-prefixed section coverage 99.75% 97.71% 91.30%
unlikely-prefixed section size percentage 67.94% 39.37% 63.10%
  1. The coverage is defined as #perf-sample-in-hot-prefixed <data> section / #perf-sample in <data.*> section for each section.
    • The perf command samples MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2 events at a high frequency (perf -c 2251) for 30 seconds. The profiled binary is built as non-PIE so data.rel.ro coverage data is not available.
  2. The unlikely-prefixed <data> section size percentage is defined as unlikely <data> section size / the sum size of <data>.* sections for each <data> section

@llvmbot
Copy link
Member

llvmbot commented Feb 4, 2025

@llvm/pr-subscribers-llvm-analysis
@llvm/pr-subscribers-pgo
@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-ir

Author: Mingming Liu (mingmingl-llvm)

Changes

In this PR, static-data-splitter pass finds out the local-linkage global variables in {.rodata, .data.rel.ro, bss, .data} sections by analyzing machine instruction operands, and aggregates their accesses from code across functions.

A follow-up item is to analyze global variable initializers and count for access from data.

  • This limitation is demonstrated by bss2 and data3 in llvm/test/CodeGen/X86/global-variable-partition.ll.

Some stats of static-data-splitter with this patch:

section bss rodata data
hot-prefixed section coverage 99.75% 97.71% 91.30%
unlikely-prefixed section size percentage 67.94% 39.37% 63.10%
  1. The coverage is defined as #perf-sample-in-hot-prefixed &lt;data&gt; section / #perf-sample in &lt;data.*&gt; section for each <data> section.
    • The perf command samples MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2 events at a high frequency (perf -c 2251) for 30 seconds. The profiled binary is built as non-PIE so data.rel.ro coverage data is not available.
  2. The unlikely-prefixed &lt;data&gt; section size percentage is defined as unlikely &lt;data&gt; section size / the sum size of &lt;data&gt;.* sections for each &lt;data&gt; section

Patch is 28.13 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125756.diff

14 Files Affected:

  • (modified) llvm/include/llvm/IR/Function.h (-6)
  • (modified) llvm/include/llvm/IR/GlobalObject.h (+11)
  • (modified) llvm/include/llvm/IR/MDBuilder.h (+2-2)
  • (modified) llvm/lib/CodeGen/StaticDataSplitter.cpp (+149-52)
  • (modified) llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp (+6)
  • (modified) llvm/lib/IR/Function.cpp (-16)
  • (modified) llvm/lib/IR/Globals.cpp (+30)
  • (modified) llvm/lib/IR/MDBuilder.cpp (+3-3)
  • (added) llvm/test/CodeGen/X86/data-section-prefix.ll (+27)
  • (added) llvm/test/CodeGen/X86/global-variable-partition.ll (+165)
  • (modified) llvm/test/Transforms/CodeGenPrepare/X86/section-samplepgo.ll (+2-2)
  • (modified) llvm/test/Transforms/CodeGenPrepare/X86/section.ll (+2-2)
  • (modified) llvm/test/Transforms/HotColdSplit/coldentrycount.ll (+2-2)
  • (modified) llvm/test/Transforms/SampleProfile/section-accurate-samplepgo.ll (+3-3)
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index fcd5396ccfdbc87..29041688124bc29 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -346,12 +346,6 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
   /// sample PGO, to enable the same inlines as the profiled optimized binary.
   DenseSet<GlobalValue::GUID> getImportGUIDs() const;
 
-  /// Set the section prefix for this function.
-  void setSectionPrefix(StringRef Prefix);
-
-  /// Get the section prefix for this function.
-  std::optional<StringRef> getSectionPrefix() const;
-
   /// hasGC/getGC/setGC/clearGC - The name of the garbage collection algorithm
   ///                             to use during code generation.
   bool hasGC() const {
diff --git a/llvm/include/llvm/IR/GlobalObject.h b/llvm/include/llvm/IR/GlobalObject.h
index 08edc13d81f880a..bb50c39813e1407 100644
--- a/llvm/include/llvm/IR/GlobalObject.h
+++ b/llvm/include/llvm/IR/GlobalObject.h
@@ -124,6 +124,17 @@ class GlobalObject : public GlobalValue {
   /// appropriate default object file section.
   void setSection(StringRef S);
 
+  /// Set the section prefix for this global object.
+  void setSectionPrefix(StringRef Prefix);
+
+  /// Update the section prefix, unless the existing prefix is the same as
+  /// `KeepPrefix`.
+  void updateSectionPrefix(StringRef Prefix,
+                           std::optional<StringRef> KeepPrefix = std::nullopt);
+
+  /// Get the section prefix for this global object.
+  std::optional<StringRef> getSectionPrefix() const;
+
   bool hasComdat() const { return getComdat() != nullptr; }
   const Comdat *getComdat() const { return ObjComdat; }
   Comdat *getComdat() { return ObjComdat; }
diff --git a/llvm/include/llvm/IR/MDBuilder.h b/llvm/include/llvm/IR/MDBuilder.h
index e02ec8f5a3d8bb1..ce4e1da656049d1 100644
--- a/llvm/include/llvm/IR/MDBuilder.h
+++ b/llvm/include/llvm/IR/MDBuilder.h
@@ -89,8 +89,8 @@ class MDBuilder {
   MDNode *createFunctionEntryCount(uint64_t Count, bool Synthetic,
                                    const DenseSet<GlobalValue::GUID> *Imports);
 
-  /// Return metadata containing the section prefix for a function.
-  MDNode *createFunctionSectionPrefix(StringRef Prefix);
+  /// Return metadata containing the section prefix for a global object.
+  MDNode *createGlobalObjectSectionPrefix(StringRef Prefix);
 
   /// Return metadata containing the pseudo probe descriptor for a function.
   MDNode *createPseudoProbeDesc(uint64_t GUID, uint64_t Hash, StringRef FName);
diff --git a/llvm/lib/CodeGen/StaticDataSplitter.cpp b/llvm/lib/CodeGen/StaticDataSplitter.cpp
index e5bf0a5a3a255f6..f09e3b41e0723e6 100644
--- a/llvm/lib/CodeGen/StaticDataSplitter.cpp
+++ b/llvm/lib/CodeGen/StaticDataSplitter.cpp
@@ -9,13 +9,13 @@
 // The pass uses branch profile data to assign hotness based section qualifiers
 // for the following types of static data:
 // - Jump tables
+// - Module-internal global variables
 // - Constant pools (TODO)
-// - Other module-internal data (TODO)
 //
 // For the original RFC of this pass please see
 // https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
 
-#include "llvm/ADT/ScopeExit.h"
+#include "llvm/ADT/APInt.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Analysis/ProfileSummaryInfo.h"
 #include "llvm/CodeGen/MBFIWrapper.h"
@@ -27,9 +27,12 @@
 #include "llvm/CodeGen/MachineFunctionPass.h"
 #include "llvm/CodeGen/MachineJumpTableInfo.h"
 #include "llvm/CodeGen/Passes.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/Module.h"
 #include "llvm/InitializePasses.h"
 #include "llvm/Pass.h"
 #include "llvm/Support/CommandLine.h"
+#include "llvm/Target/TargetLoweringObjectFile.h"
 
 using namespace llvm;
 
@@ -46,12 +49,27 @@ class StaticDataSplitter : public MachineFunctionPass {
   const MachineBlockFrequencyInfo *MBFI = nullptr;
   const ProfileSummaryInfo *PSI = nullptr;
 
-  // Returns true iff any jump table is hot-cold categorized.
-  bool splitJumpTables(MachineFunction &MF);
+  void updateStats(bool ProfileAvailable, const MachineJumpTableInfo *MJTI);
+  void updateJumpTableStats(bool ProfileAvailable,
+                            const MachineJumpTableInfo &MJTI);
 
-  // Same as above but works on functions with profile information.
-  bool splitJumpTablesWithProfiles(const MachineFunction &MF,
-                                   MachineJumpTableInfo &MJTI);
+  // Use profiles to partition static data.
+  bool partitionStaticDataWithProfiles(MachineFunction &MF);
+
+  // If the global value is a local linkage global variable, return it.
+  // Otherwise, return nullptr.
+  const GlobalVariable *getLocalLinkageGlobalVariable(const GlobalValue *GV);
+
+  // Returns true if the global variable is in one of {.rodata, .bss, .data,
+  // .data.rel.ro} sections
+  bool inStaticDataSection(const GlobalVariable *GV, const TargetMachine &TM);
+
+  // Iterate all global variables in the module and update the section prefix
+  // of the module-internal data.
+  void updateGlobalVariableSectionPrefix(MachineFunction &MF);
+
+  // Accummulated data profile count across machine functions in the module.
+  DenseMap<const GlobalVariable *, APInt> DataProfileCounts;
 
 public:
   static char ID;
@@ -77,13 +95,24 @@ bool StaticDataSplitter::runOnMachineFunction(MachineFunction &MF) {
   MBFI = &getAnalysis<MachineBlockFrequencyInfoWrapperPass>().getMBFI();
   PSI = &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
 
-  return splitJumpTables(MF);
+  const bool ProfileAvailable = PSI && PSI->hasProfileSummary() && MBFI &&
+                                MF.getFunction().hasProfileData();
+  bool Changed = false;
+
+  if (ProfileAvailable)
+    Changed |= partitionStaticDataWithProfiles(MF);
+
+  updateGlobalVariableSectionPrefix(MF);
+  updateStats(ProfileAvailable, MF.getJumpTableInfo());
+  return Changed;
 }
 
-bool StaticDataSplitter::splitJumpTablesWithProfiles(
-    const MachineFunction &MF, MachineJumpTableInfo &MJTI) {
+bool StaticDataSplitter::partitionStaticDataWithProfiles(MachineFunction &MF) {
   int NumChangedJumpTables = 0;
 
+  const TargetMachine &TM = MF.getTarget();
+  MachineJumpTableInfo *MJTI = MF.getJumpTableInfo();
+
   // Jump table could be used by either terminating instructions or
   // non-terminating ones, so we walk all instructions and use
   // `MachineOperand::isJTI()` to identify jump table operands.
@@ -92,63 +121,131 @@ bool StaticDataSplitter::splitJumpTablesWithProfiles(
   for (const auto &MBB : MF) {
     for (const MachineInstr &I : MBB) {
       for (const MachineOperand &Op : I.operands()) {
-        if (!Op.isJTI())
-          continue;
-        const int JTI = Op.getIndex();
-        // This is not a source block of jump table.
-        if (JTI == -1)
+        std::optional<uint64_t> Count = std::nullopt;
+        if (!Op.isJTI() && !Op.isGlobal())
           continue;
 
-        auto Hotness = MachineFunctionDataHotness::Hot;
+        Count = MBFI->getBlockProfileCount(&MBB);
+
+        if (Op.isJTI()) {
+          assert(MJTI != nullptr && "Jump table info is not available.");
+          const int JTI = Op.getIndex();
+          // This is not a source block of jump table.
+          if (JTI == -1)
+            continue;
+
+          auto Hotness = MachineFunctionDataHotness::Hot;
+
+          // Hotness is based on source basic block hotness.
+          // TODO: PSI APIs are about instruction hotness. Introduce API for
+          // data access hotness.
+          if (Count && PSI->isColdCount(*Count))
+            Hotness = MachineFunctionDataHotness::Cold;
 
-        // Hotness is based on source basic block hotness.
-        // TODO: PSI APIs are about instruction hotness. Introduce API for data
-        // access hotness.
-        if (PSI->isColdBlock(&MBB, MBFI))
-          Hotness = MachineFunctionDataHotness::Cold;
+          if (MJTI->updateJumpTableEntryHotness(JTI, Hotness))
+            ++NumChangedJumpTables;
+        } else if (Op.isGlobal()) {
+          // Find global variables with local linkage
+          const GlobalVariable *GV =
+              getLocalLinkageGlobalVariable(Op.getGlobal());
+          if (!GV || !inStaticDataSection(GV, TM))
+            continue;
 
-        if (MJTI.updateJumpTableEntryHotness(JTI, Hotness))
-          ++NumChangedJumpTables;
+          // Acccumulate data profile count across machine function
+          // instructions.
+          // TODO: Analyze global variable's initializers.
+          if (Count) {
+            auto [It, Inserted] =
+                DataProfileCounts.try_emplace(GV, APInt(128, 0));
+            It->second += *Count;
+          }
+        }
       }
     }
   }
   return NumChangedJumpTables > 0;
 }
 
-bool StaticDataSplitter::splitJumpTables(MachineFunction &MF) {
-  MachineJumpTableInfo *MJTI = MF.getJumpTableInfo();
-  if (!MJTI || MJTI->getJumpTables().empty())
-    return false;
-
-  const bool ProfileAvailable = PSI && PSI->hasProfileSummary() && MBFI &&
-                                MF.getFunction().hasProfileData();
-  auto statOnExit = llvm::make_scope_exit([&] {
-    if (!AreStatisticsEnabled())
-      return;
+void StaticDataSplitter::updateJumpTableStats(
+    bool ProfileAvailable, const MachineJumpTableInfo &MJTI) {
+  if (!ProfileAvailable) {
+    NumUnknownJumpTables += MJTI.getJumpTables().size();
+    return;
+  }
 
-    if (!ProfileAvailable) {
-      NumUnknownJumpTables += MJTI->getJumpTables().size();
-      return;
+  for (size_t JTI = 0; JTI < MJTI.getJumpTables().size(); JTI++) {
+    auto Hotness = MJTI.getJumpTables()[JTI].Hotness;
+    if (Hotness == MachineFunctionDataHotness::Hot) {
+      ++NumHotJumpTables;
+    } else {
+      assert(Hotness == MachineFunctionDataHotness::Cold &&
+             "A jump table is either hot or cold when profile information is "
+             "available.");
+      ++NumColdJumpTables;
     }
+  }
+}
 
-    for (size_t JTI = 0; JTI < MJTI->getJumpTables().size(); JTI++) {
-      auto Hotness = MJTI->getJumpTables()[JTI].Hotness;
-      if (Hotness == MachineFunctionDataHotness::Hot) {
-        ++NumHotJumpTables;
-      } else {
-        assert(Hotness == MachineFunctionDataHotness::Cold &&
-               "A jump table is either hot or cold when profile information is "
-               "available.");
-        ++NumColdJumpTables;
-      }
-    }
-  });
+void StaticDataSplitter::updateStats(bool ProfileAvailable,
+                                     const MachineJumpTableInfo *MJTI) {
+  if (!AreStatisticsEnabled())
+    return;
 
-  // Place jump tables according to block hotness if function has profile data.
-  if (ProfileAvailable)
-    return splitJumpTablesWithProfiles(MF, *MJTI);
+  if (MJTI)
+    updateJumpTableStats(ProfileAvailable, *MJTI);
+}
 
-  return true;
+const GlobalVariable *
+StaticDataSplitter::getLocalLinkageGlobalVariable(const GlobalValue *GV) {
+  if (!GV || GV->isDeclarationForLinker())
+    return nullptr;
+
+  return GV->hasLocalLinkage() ? dyn_cast<GlobalVariable>(GV) : nullptr;
+}
+
+bool StaticDataSplitter::inStaticDataSection(const GlobalVariable *GV,
+                                             const TargetMachine &TM) {
+  assert(GV && "Caller guaranteed");
+
+  // Skip LLVM reserved symbols.
+  if (GV->getName().starts_with("llvm."))
+    return false;
+
+  SectionKind Kind = TargetLoweringObjectFile::getKindForGlobal(GV, TM);
+  return Kind.isData() || Kind.isReadOnly() || Kind.isReadOnlyWithRel() ||
+         Kind.isBSS();
+}
+
+void StaticDataSplitter::updateGlobalVariableSectionPrefix(
+    MachineFunction &MF) {
+  for (GlobalVariable &GV : MF.getFunction().getParent()->globals()) {
+    if (GV.isDeclarationForLinker())
+      continue;
+    // DataProfileCounts accumulates data profile count across all machine
+    // function instructions, and it can't model the indirect accesses through
+    // other global variables' initializers.
+    // TODO: Analyze the users of module-internal global variables and see
+    // through the users' initializers. Do not place a global variable into
+    // unlikely section if any of its users are potentially hot.
+    auto Iter = DataProfileCounts.find(&GV);
+    if (Iter == DataProfileCounts.end())
+      continue;
+
+    // StaticDataSplitter is made a machine function pass rather than a module
+    // pass because (Lazy)MachineBlockFrequencyInfo is a machine-function
+    // analysis pass and cannot be used for a legacy module pass.
+    // As a result, we use `DataProfileCounts` to accumulate data
+    // profile count across machine functions and update global variable section
+    // prefix once per machine function.
+    // FIXME: Make StaticDataSplitter a module pass under new pass manager
+    // framework, and set global variable section prefix once per module after
+    // analyzing all machine functions.
+    if (PSI->isColdCount(Iter->second.getZExtValue())) {
+      GV.updateSectionPrefix("unlikely", std::make_optional(StringRef("hot")));
+    } else if (PSI->isHotCount(Iter->second.getZExtValue())) {
+      GV.updateSectionPrefix("hot");
+    }
+  }
 }
 
 char StaticDataSplitter::ID = 0;
diff --git a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
index 3c2c7c8c9fed69a..d20ab29cc197974 100644
--- a/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
@@ -670,6 +670,7 @@ getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
   }
 
   bool HasPrefix = false;
+
   if (const auto *F = dyn_cast<Function>(GO)) {
     // Jump table hotness takes precedence over its enclosing function's hotness
     // if it's known. The function's section prefix is used if jump table entry
@@ -687,6 +688,11 @@ getELFSectionNameForGlobal(const GlobalObject *GO, SectionKind Kind,
       raw_svector_ostream(Name) << '.' << *Prefix;
       HasPrefix = true;
     }
+  } else if (const auto *GV = dyn_cast<GlobalVariable>(GO)) {
+    if (std::optional<StringRef> Prefix = GV->getSectionPrefix()) {
+      raw_svector_ostream(Name) << '.' << *Prefix;
+      HasPrefix = true;
+    }
   }
 
   if (UniqueSectionName) {
diff --git a/llvm/lib/IR/Function.cpp b/llvm/lib/IR/Function.cpp
index e6f0d64d071ba67..5666f0a53866fda 100644
--- a/llvm/lib/IR/Function.cpp
+++ b/llvm/lib/IR/Function.cpp
@@ -1164,22 +1164,6 @@ DenseSet<GlobalValue::GUID> Function::getImportGUIDs() const {
   return R;
 }
 
-void Function::setSectionPrefix(StringRef Prefix) {
-  MDBuilder MDB(getContext());
-  setMetadata(LLVMContext::MD_section_prefix,
-              MDB.createFunctionSectionPrefix(Prefix));
-}
-
-std::optional<StringRef> Function::getSectionPrefix() const {
-  if (MDNode *MD = getMetadata(LLVMContext::MD_section_prefix)) {
-    assert(cast<MDString>(MD->getOperand(0))->getString() ==
-               "function_section_prefix" &&
-           "Metadata not match");
-    return cast<MDString>(MD->getOperand(1))->getString();
-  }
-  return std::nullopt;
-}
-
 bool Function::nullPointerIsDefined() const {
   return hasFnAttribute(Attribute::NullPointerIsValid);
 }
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index db5e1cb57b1bab8..884089262e4659d 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -18,6 +18,7 @@
 #include "llvm/IR/GlobalAlias.h"
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Module.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/ErrorHandling.h"
@@ -286,6 +287,35 @@ void GlobalObject::setSection(StringRef S) {
   setGlobalObjectFlag(HasSectionHashEntryBit, !S.empty());
 }
 
+void GlobalObject::setSectionPrefix(StringRef Prefix) {
+  MDBuilder MDB(getContext());
+  setMetadata(LLVMContext::MD_section_prefix,
+              MDB.createGlobalObjectSectionPrefix(Prefix));
+}
+
+void GlobalObject::updateSectionPrefix(StringRef Prefix,
+                                       std::optional<StringRef> KeepPrefix) {
+  auto SectionPrefix = getSectionPrefix();
+  if (SectionPrefix && (*SectionPrefix == Prefix ||
+                        (KeepPrefix && *SectionPrefix == *KeepPrefix)))
+    return;
+
+  setSectionPrefix(Prefix);
+  return;
+}
+
+std::optional<StringRef> GlobalObject::getSectionPrefix() const {
+  if (MDNode *MD = getMetadata(LLVMContext::MD_section_prefix)) {
+    [[maybe_unused]] StringRef MDName =
+        cast<MDString>(MD->getOperand(0))->getString();
+    assert((MDName == "section_prefix" ||
+            (isa<Function>(this) && MDName == "function_section_prefix")) &&
+           "Metadata not match");
+    return cast<MDString>(MD->getOperand(1))->getString();
+  }
+  return std::nullopt;
+}
+
 bool GlobalValue::isNobuiltinFnDef() const {
   const Function *F = dyn_cast<Function>(this);
   if (!F || F->empty())
diff --git a/llvm/lib/IR/MDBuilder.cpp b/llvm/lib/IR/MDBuilder.cpp
index 26c8ab9fc36c850..b6aa8844a7eafa7 100644
--- a/llvm/lib/IR/MDBuilder.cpp
+++ b/llvm/lib/IR/MDBuilder.cpp
@@ -87,9 +87,9 @@ MDNode *MDBuilder::createFunctionEntryCount(
   return MDNode::get(Context, Ops);
 }
 
-MDNode *MDBuilder::createFunctionSectionPrefix(StringRef Prefix) {
-  return MDNode::get(
-      Context, {createString("function_section_prefix"), createString(Prefix)});
+MDNode *MDBuilder::createGlobalObjectSectionPrefix(StringRef Prefix) {
+  return MDNode::get(Context,
+                     {createString("section_prefix"), createString(Prefix)});
 }
 
 MDNode *MDBuilder::createRange(const APInt &Lo, const APInt &Hi) {
diff --git a/llvm/test/CodeGen/X86/data-section-prefix.ll b/llvm/test/CodeGen/X86/data-section-prefix.ll
new file mode 100644
index 000000000000000..4812fc70758fbce
--- /dev/null
+++ b/llvm/test/CodeGen/X86/data-section-prefix.ll
@@ -0,0 +1,27 @@
+; RUN: llc -mtriple x86_64-linux-gnu -data-sections %s -o - | FileCheck %s --check-prefix=ELF
+; RUN: llc -mtriple x86_64-linux-gnu -unique-section-names=0 -data-sections %s -o - | FileCheck %s --check-prefix=ELF-NOUNIQ
+
+; RUN: llc -mtriple x86_64-windows-msvc -data-sections %s -o - | FileCheck %s --check-prefix=COFF-MSVC
+
+; ELF: .section .data.hot.foo,
+; ELF: .section .data.bar,
+; ELF: .section .bss.unlikely.baz,
+; ELF: .section .bss.quz,
+
+; ELF-NOUNIQ: .section    .data.hot.,"aw",@progbits,unique,1
+; ELF-NOUNIQ: .section    .data,"aw",@progbits,unique,2
+; ELF-NOUNIQ: .section    .bss.unlikely.,"aw",@nobits,unique,3
+; ELF-NOUNIQ: .section    .bss,"aw",@nobits,unique,4
+
+; COFF-MSVC: .section .data,"dw",one_only,foo
+; COFF-MSVC: .section .data,"dw",one_only,bar
+; COFF-MSVC: .section .bss,"bw",one_only,baz
+; COFF-MSVC: .section .bss,"bw",one_only,quz
+
+@foo = global i32 1, !section_prefix !0
+@bar = global i32 2
+@baz = global i32 0, !section_prefix !1
+@quz = global i32 0
+
+!0 = !{!"section_prefix", !"hot"}
+!1 = !{!"section_prefix", !"unlikely"}
diff --git a/llvm/test/CodeGen/X86/global-variable-partition.ll b/llvm/test/CodeGen/X86/global-variable-partition.ll
new file mode 100644
index 000000000000000..bb77f3362406bdc
--- /dev/null
+++ b/llvm/test/CodeGen/X86/global-variable-partition.ll
@@ -0,0 +1,165 @@
+
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN:     -partition-static-data-sections=true -data-sections=true \
+; RUN:     -unique-section-names=true -relocation-model=pic \
+; RUN:     %s -o - 2>&1 | FileCheck %s --check-prefixes=SYM,DATA
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN:     -partition-static-data-sections=true -data-sections=true \
+; RUN:     -unique-section-names=false -relocation-model=pic \
+; RUN:     %s -o - 2>&1 | FileCheck %s --check-prefixes=UNIQ,DATA
+
+; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
+; RUN:     -partition-static-data-sections=true -data-sections=false \
+; RUN:     -unique-section-names=false -relocation-model=pic \
+; RUN:     %s -o - 2>&1 | FileCheck %s --check-prefixes=AGG,DATA
+
+; SYM: .section .rodata.str1.1.hot.
+; UNIQ: .section	.rodata.str1.1.hot.,"aMS",@progbits,1
+; AGG: .section	.rodata.str1.1.hot
+; DATA: .L.str
+; DATA:    "hot\t"
+; DATA: .L.str.1
+; DATA:    "%d\t%d\t%d\n"
+
+
+; SYM:  .sect...
[truncated]

@mingmingl-llvm
Copy link
Contributor Author

cc @Colibrow

@mingmingl-llvm
Copy link
Contributor Author

I also created #125758 and #125757 as two smaller splits of this PR.

mingmingl-llvm added a commit that referenced this pull request Feb 5, 2025
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 5, 2025
@petrhosek
Copy link
Member

Is there a reason for using .unlikely as a prefix rather than .cold? .cold seems more obvious as an opposite of .hot and we use it elsewhere in LLVM, most notably hot-cold splitting.

Comment on lines 124 to 128
std::optional<uint64_t> Count = std::nullopt;
if (!Op.isJTI() && !Op.isGlobal())
continue;

auto Hotness = MachineFunctionDataHotness::Hot;
Count = MBFI->getBlockProfileCount(&MBB);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::optional<uint64_t> Count = std::nullopt;
if (!Op.isJTI() && !Op.isGlobal())
continue;
auto Hotness = MachineFunctionDataHotness::Hot;
Count = MBFI->getBlockProfileCount(&MBB);
if (!Op.isJTI() && !Op.isGlobal())
continue;
std::optional<uint64_t> Count = MBFI->getBlockProfileCount(&MBB);

is that OK?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the change is ok.

I kept getting "Applying suggestions on deleted lines is currently not supported." error when trying to commit the suggestion. I'll make the change in the local client..

Comment on lines 21 to 27
@foo = global i32 1, !section_prefix !0
@bar = global i32 2
@baz = global i32 0, !section_prefix !1
@quz = global i32 0

!0 = !{!"section_prefix", !"hot"}
!1 = !{!"section_prefix", !"unlikely"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if data_section_prefix would make sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function and global variables are derived classes of GlobalObject, and currently function section prefix metadata has function_section_prefix.

To me, data_section_prefix carries the semantic that a global variable/object is a piece of data, which may become inaccurate. Alternative to setting global object's section prefix, I can keep function's methods as they are in TOT, and implements {set,get,update}SectionPrefix for GlobalVariables (not its superclass GlobalObject) and name it global_variable_section_prefix. How does that sound?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. I get it.

Copy link
Contributor Author

@mingmingl-llvm mingmingl-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for using .unlikely as a prefix rather than .cold? .cold seems more obvious as an opposite of .hot and we use it elsewhere in LLVM, most notably hot-cold splitting.

@petrhosek Thanks for the question. unlikely is used to follow the conventions of functions in the codegenprepare pass. I'm open to name changes.

If identical section names (as string literals) are not de-duplicated in the relocatable object files (e.g., when compiled with -data-sections=true -unique-section-names=false), cold will give smaller section names than unlikely per data section where this matters.

Comment on lines 124 to 128
std::optional<uint64_t> Count = std::nullopt;
if (!Op.isJTI() && !Op.isGlobal())
continue;

auto Hotness = MachineFunctionDataHotness::Hot;
Count = MBFI->getBlockProfileCount(&MBB);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the change is ok.

I kept getting "Applying suggestions on deleted lines is currently not supported." error when trying to commit the suggestion. I'll make the change in the local client..

Comment on lines 21 to 27
@foo = global i32 1, !section_prefix !0
@bar = global i32 2
@baz = global i32 0, !section_prefix !1
@quz = global i32 0

!0 = !{!"section_prefix", !"hot"}
!1 = !{!"section_prefix", !"unlikely"}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function and global variables are derived classes of GlobalObject, and currently function section prefix metadata has function_section_prefix.

To me, data_section_prefix carries the semantic that a global variable/object is a piece of data, which may become inaccurate. Alternative to setting global object's section prefix, I can keep function's methods as they are in TOT, and implements {set,get,update}SectionPrefix for GlobalVariables (not its superclass GlobalObject) and name it global_variable_section_prefix. How does that sound?

@snehasish
Copy link
Contributor

Is there a reason for using .unlikely as a prefix rather than .cold? .cold seems more obvious as an opposite of .hot and we use it elsewhere in LLVM, most notably hot-cold splitting.

@petrhosek Thanks for the question. unlikely is used to follow the conventions of functions in the codegenprepare pass. I'm open to name changes.

If identical section names (as string literals) are not de-duplicated in the relocatable object files (e.g., when compiled with -data-sections=true -unique-section-names=false), cold will give smaller section names than unlikely per data section where this matters.

I believe the .cold suffix is used for symbol names whereas the .unlikely suffix is for section names. It would be inconsistent to have a .text.unlikely section and a .rodata.cold section but I don't have a particularly strong opinion.

Copy link
Contributor

@williamweixiao williamweixiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just some minor comments.

assert(GV && "Caller guaranteed");

// Skip LLVM reserved symbols.
if (GV->getName().starts_with("llvm."))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please explain why need to skip LLVM reserved symbols?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The globals with llvm. as a name prefix are usually handled specially, by many (middle-end and back-end) passes (e.g., AsmPrinter emits special LLVM values specially), and they are skipped in this pass mostly out of conservativeness.

I moved this check before calling inStaticDataSection helper though for readability, and added a comment around the check. What do you think about this?

// framework, and set global variable section prefix once per module after
// analyzing all machine functions.
if (PSI->isColdCount(Iter->second.getZExtValue())) {
Changed |= GV.updateSectionPrefix("unlikely",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not prefix "split" instead of "unlikely" to align with "MachineFunctionSplitter"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that functions have .hot or .unlikely prefixes when they are placed as a whole (in the codegenprepare pass), and have .split as a prefix when basic blocks of a function are further categorized into hot and non-hot ones (in the MFS pass you mentioned).

Here the granularity is a piece of global variable, and .split would be a good name if a global variable itself is split into hot and cold ones. For instance, there should be hot and cold functions in a vtable, but it'd take substantial work to implement an ABI to support the split vtable entries.

}
}
}
}
return NumChangedJumpTables > 0;
}

const GlobalVariable *
StaticDataSplitter::getLocalLinkageGlobalVariable(const GlobalValue *GV) {
if (!GV || GV->isDeclarationForLinker())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can "GV->hasLocalLinkage()" cover "GV->isDeclarationForLinker()"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I haven't thought about it before.

It turns out IR verifier requires a declaration to have one of 'valid' decl linkages, and valid decl linkages must be at least external according to isValidDeclarationLinkage.

I removed GV->isDeclarationForLinker() here and added a comment.


/// Update the section prefix, unless the existing prefix is the same as
/// `KeepPrefix`.
bool updateSectionPrefix(StringRef Prefix,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to drop this after merging #125757.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

It occurred to me that counters will increase monotonically (before it becomes max), so I asserted a global variable should not have .hot prefix under if (PSI->isColdCount(Iter->second)) { around StaticDataSplitter.cpp L 224.

Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…he base class of {Function, GlobalVariable, IFunc} (llvm#125757)

This is a split of llvm#125756
* In StaticDataProfileInfo.h/cpp, add an immutable pass to keep track of
  constants and their profile information across functions in a module.
* Add a module pass, StaticDataAnnotator, to set global variable's
  section prefix based on module-wide hotness.
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Feb 28, 2025
@mingmingl-llvm
Copy link
Contributor Author

Hi folks, just a quick note for this change when you have a moment :)

@@ -0,0 +1,68 @@
#ifndef LLVM_ANALYSIS_STATICDATAPROFILEINFO_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is reasonable to put this definition into ProfileSummaryInfo.h, as it is about summary/aggregation of const profiles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I don't feel strong about having new classes in ProfileSummaryInfo.h or in new files. Actually I once thought about the former and hesitated between the two.

I ended up with the latter mainly because PSI is included in multiple cpp files, and the new classes don't need to be included as much.

As the static data partitioning work is under development and symbolized data access profiles may need helper functions and libraries, I think we can keep it this way and do refactors later where they fit. What do you think about this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

return;
}
uint64_t &OriginalCount = ConstantProfileCounts[C];
OriginalCount += llvm::SaturatingAdd(*Count, OriginalCount);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should += be just =?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the catch! done.

void addConstantProfileCount(const Constant *C,
std::optional<uint64_t> Count);

/// If \p C has a count, return it. Otherwise, return std::nullopt.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it returns nullopt, it does not mean the constant has unknown profile count -- it can just mean the profile count has not been set/aggregate for the constant. When the analysis phase is done, returning nullopt seems the same has 'hasUnknownCount()' below. Perhaps clarify in the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the point that the difference between hasUnknownCount and getConstantProfileCount are subtle. I thought about this more.

Basically, StaticDataAnnotator uses both PSI and this class to decide the section prefix of a constant, so an alternative is (for this class) to provide getConstantSectionPrefix directly. The updated change implements the alternative option, and makes getConstantProfileCount to a private class method.


// Skip global variables without profile counts. The module may not be
// profiled or instrumented.
auto Count = SDPI->getConstantProfileCount(&GV);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, will hasUnknownCount GVs return nullopt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the original code, the semantic of getConstantProfileCount and hasUnknownCount are subtle on the different combinations (i.e., whether a var is used in hot/cold/lukewarm/unprofiled functions, etc)

The updated code will query a section prefix from StaticDataProfileInfo directly there. PTAL.

// prefix isn't hot already.
GV.setSectionPrefix("hot");
Changed = true;
} else if (PSI->isColdCount(*Count) && !SDPI->hasUnknownCount(&GV) &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the hasUnknownCount check needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mingmingl-llvm mingmingl-llvm requested a review from david-xl March 28, 2025 05:52
@@ -0,0 +1,68 @@
#ifndef LLVM_ANALYSIS_STATICDATAPROFILEINFO_H
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

continue;

StringRef SectionPrefix = SDPI->getConstantSectionPrefix(&GV, PSI);
if (SectionPrefix.empty() || alreadyHasSectionPrefix(GV, SectionPrefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation can the constant already have a section prefix?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang provides a section attribute which can be attached to globals. This is commonly used in embedded applications as far as I know.

https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate

@mingmingl-llvm I don't think we have a test which covers this condition behaviour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what situation can the constant already have a section prefix?

Good question. A search of GlobalValue::setSectionPrefix in the LLVM repo shows that it's only used for functions but not global variables, so the alreadyHasSectionPrefix should return always false currently. To be a little bit future proof, updated the patch to report error if a global variable already has a section prefix, and commented that the subsequent code doesn't handle 'update' properly.

clang provides a section attribute which can be attached to globals. This is commonly used in embedded applications as far as I know.

Thanks for pointing this out.

For test coverage purposes, the test case was updated with @cold_data_custom_foo_section and @hot_data_custom_bar_section; the former uses clang's section attribute, and the latter uses #pragma clang section .

Taking a look at [1], such global variables [1] will have explicit sections, and this work (hot-cold partitioning) won't kick in. Per offline discussion, it's desired for compiler to keep source code annotations without partitioning.

[1]

/// This method computes the appropriate section to emit the specified global
/// variable or function definition. This should not be passed external (or
/// available externally) globals.
MCSection *TargetLoweringObjectFile::SectionForGlobal(
const GlobalObject *GO, SectionKind Kind, const TargetMachine &TM) const {
// Select section name.
if (GO->hasSection())
return getExplicitSectionGlobal(GO, Kind, TM);
if (auto *GVar = dyn_cast<GlobalVariable>(GO)) {
auto Attrs = GVar->getAttributes();
if ((Attrs.hasAttribute("bss-section") && Kind.isBSS()) ||
(Attrs.hasAttribute("data-section") && Kind.isData()) ||
(Attrs.hasAttribute("relro-section") && Kind.isReadOnlyWithRel()) ||
(Attrs.hasAttribute("rodata-section") && Kind.isReadOnly())) {
return getExplicitSectionGlobal(GO, Kind, TM);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

continue;

StringRef SectionPrefix = SDPI->getConstantSectionPrefix(&GV, PSI);
if (SectionPrefix.empty() || alreadyHasSectionPrefix(GV, SectionPrefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang provides a section attribute which can be attached to globals. This is commonly used in embedded applications as far as I know.

https://clang.llvm.org/docs/AttributeReference.html#section-declspec-allocate

@mingmingl-llvm I don't think we have a test which covers this condition behaviour?


; This RUN command sets `-data-sections=true -unique-section-names=true` so data
; sections are uniqufied by numbers.
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -enable-split-machine-functions \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to update this test too once we remove the dependence on -enable-split-machine-functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack!

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

continue;

StringRef SectionPrefix = SDPI->getConstantSectionPrefix(&GV, PSI);
if (SectionPrefix.empty() || alreadyHasSectionPrefix(GV, SectionPrefix))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM.

@mingmingl-llvm
Copy link
Contributor Author

The presubmit failure in LLVM.tools/llvm-exegesis/RISCV/rvv/filter.test doesn't look relevant according to error message, and the test passed locally (with -DLLVM_TARGETS_TO_BUILD="X86;AArch64;RISCV"). So it might be flakiness.

@mingmingl-llvm mingmingl-llvm merged commit c8a70f4 into main Mar 28, 2025
9 of 11 checks passed
@mingmingl-llvm mingmingl-llvm deleted the users/mingmingl-llvm/spr/globalvariables branch March 28, 2025 23:31
@llvm-ci
Copy link
Collaborator

llvm-ci commented Mar 29, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve-vla running on linaro-g3-04 while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/17/builds/6850

Here is the relevant piece of the build log for the reference
Step 7 (ninja check 1) failure: stage 1 checked (failure)
...
PASS: Flang :: Driver/print-effective-triple.f90 (25166 of 97464)
PASS: Flang :: Driver/print-resource-dir.F90 (25167 of 97464)
PASS: Flang :: Driver/predefined-macros-x86.f90 (25168 of 97464)
PASS: Flang :: Driver/missing-arg.f90 (25169 of 97464)
PASS: Flang :: Driver/override-triple.ll (25170 of 97464)
PASS: Flang :: Driver/implicit-none.f90 (25171 of 97464)
PASS: Flang :: Driver/phases.f90 (25172 of 97464)
PASS: Flang :: Driver/parse-fir-error.ll (25173 of 97464)
PASS: Flang :: Driver/pp-fixed-form.f90 (25174 of 97464)
UNRESOLVED: Flang :: Driver/slp-vectorize.ll (25175 of 97464)
******************** TEST 'Flang :: Driver/slp-vectorize.ll' FAILED ********************
Test has no 'RUN:' line
********************
PASS: Flang :: Driver/parse-ir-error.f95 (25176 of 97464)
PASS: Flang :: Driver/mlir-pass-pipeline.f90 (25177 of 97464)
PASS: Flang :: Driver/mabi-riscv.f90 (25178 of 97464)
PASS: Flang :: Driver/fd-lines-as.f90 (25179 of 97464)
PASS: Flang :: Driver/scanning-error.f95 (25180 of 97464)
PASS: Flang :: Driver/include-header.f90 (25181 of 97464)
PASS: Flang :: Driver/linker-flags.f90 (25182 of 97464)
PASS: Flang :: Driver/pass-plugin-not-found.f90 (25183 of 97464)
PASS: Flang :: Driver/multiple-input-files.f90 (25184 of 97464)
PASS: Flang :: Driver/pthread.f90 (25185 of 97464)
PASS: Flang :: Driver/config-file.f90 (25186 of 97464)
PASS: Flang :: Driver/supported-suffices/f03-suffix.f03 (25187 of 97464)
PASS: Clangd Unit Tests :: ./ClangdTests/80/81 (25188 of 97464)
PASS: DataFlowSanitizer-aarch64 :: origin_ldst.c (25189 of 97464)
PASS: Flang :: Driver/print-pipeline-passes.f90 (25190 of 97464)
PASS: Flang :: Driver/mlink-builtin-bc.f90 (25191 of 97464)
PASS: Flang :: Driver/mllvm.f90 (25192 of 97464)
PASS: Flang :: Driver/std2018-wrong.f90 (25193 of 97464)
PASS: Flang :: Driver/target-gpu-features.f90 (25194 of 97464)
PASS: Flang :: Driver/lto-bc.f90 (25195 of 97464)
PASS: Flang :: Driver/no-duplicate-main.f90 (25196 of 97464)
PASS: Flang :: Driver/supported-suffices/f08-suffix.f08 (25197 of 97464)
PASS: Flang :: Driver/input-from-stdin/input-from-stdin.f90 (25198 of 97464)
PASS: Flang :: Driver/print-target-triple.f90 (25199 of 97464)
PASS: Flang :: Driver/tco-code-gen-llvm.fir (25200 of 97464)
PASS: Flang :: Driver/target.f90 (25201 of 97464)
PASS: Flang :: Driver/fsave-optimization-record.f90 (25202 of 97464)
PASS: Flang :: Driver/unsupported-vscale-max-min.f90 (25203 of 97464)
PASS: Flang :: Driver/q-unused-arguments.f90 (25204 of 97464)
PASS: Flang :: Driver/unparse-use-analyzed.f95 (25205 of 97464)
PASS: Flang :: Driver/fixed-line-length.f90 (25206 of 97464)
PASS: Flang :: Driver/fveclib.f90 (25207 of 97464)
PASS: Flang :: Driver/bbc-openmp-version-macro.f90 (25208 of 97464)
PASS: Flang :: Driver/optimization-remark-backend.f90 (25209 of 97464)
PASS: Flang :: Driver/target-machine-error.f90 (25210 of 97464)
PASS: Flang :: Driver/save-temps.f90 (25211 of 97464)

@mingmingl-llvm mingmingl-llvm restored the users/mingmingl-llvm/spr/globalvariables branch March 30, 2025 03:43
mingmingl-llvm added a commit that referenced this pull request Mar 30, 2025
)

This is a follow-up patch of
#125756

In this PR, static-data-splitter pass produces the aggregated profile
counts of constants for constant pools in a global state
(`StateDataProfileInfo`), and asm printer consumes the profile counts to
produce `.hot` or `.unlikely` prefixes.

This implementation covers both x86 and aarch64 asm printer.
@mingmingl-llvm mingmingl-llvm deleted the users/mingmingl-llvm/spr/globalvariables branch March 30, 2025 05:34
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 30, 2025
…oning (#129781)

This is a follow-up patch of
llvm/llvm-project#125756

In this PR, static-data-splitter pass produces the aggregated profile
counts of constants for constant pools in a global state
(`StateDataProfileInfo`), and asm printer consumes the profile counts to
produce `.hot` or `.unlikely` prefixes.

This implementation covers both x86 and aarch64 asm printer.
SchrodingerZhu pushed a commit to SchrodingerZhu/llvm-project that referenced this pull request Mar 31, 2025
…#129781)

This is a follow-up patch of
llvm#125756

In this PR, static-data-splitter pass produces the aggregated profile
counts of constants for constant pools in a global state
(`StateDataProfileInfo`), and asm printer consumes the profile counts to
produce `.hot` or `.unlikely` prefixes.

This implementation covers both x86 and aarch64 asm printer.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:analysis Includes value tracking, cost tables and constant folding llvm:ir llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants