Skip to content

[MemProf] Add basic summary section support #141805

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 28, 2025

Conversation

teresajohnson
Copy link
Contributor

This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.

Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).

To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.

Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via llvm-profdata show. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).

This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.

This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.

Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).

To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.

Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via `llvm-profdata show`. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).

This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels May 28, 2025
@llvmbot
Copy link
Member

llvmbot commented May 28, 2025

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-pgo

Author: Teresa Johnson (teresajohnson)

Changes

This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.

Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).

To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.

Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via llvm-profdata show. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).

This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.


Patch is 43.27 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141805.diff

20 Files Affected:

  • (modified) llvm/include/llvm/Analysis/MemoryProfileInfo.h (-5)
  • (modified) llvm/include/llvm/ProfileData/IndexedMemProfData.h (+3-2)
  • (modified) llvm/include/llvm/ProfileData/InstrProfReader.h (+10)
  • (modified) llvm/include/llvm/ProfileData/InstrProfWriter.h (+5)
  • (added) llvm/include/llvm/ProfileData/MemProfSummary.h (+70)
  • (added) llvm/include/llvm/ProfileData/MemProfSummaryBuilder.h (+47)
  • (modified) llvm/lib/Analysis/MemoryProfileInfo.cpp (-50)
  • (modified) llvm/lib/ProfileData/CMakeLists.txt (+2)
  • (modified) llvm/lib/ProfileData/IndexedMemProfData.cpp (+17-8)
  • (modified) llvm/lib/ProfileData/InstrProfWriter.cpp (+13-5)
  • (modified) llvm/lib/ProfileData/MemProfReader.cpp (+7)
  • (added) llvm/lib/ProfileData/MemProfSummary.cpp (+148)
  • (added) llvm/lib/ProfileData/MemProfSummaryBuilder.cpp (+61)
  • (modified) llvm/lib/Transforms/Instrumentation/MemProfiler.cpp (+1-13)
  • (modified) llvm/test/Transforms/PGOProfile/memprof.ll (+11)
  • (modified) llvm/test/Transforms/PGOProfile/memprof_undrift_missing_leaf.ll (+2)
  • (modified) llvm/test/tools/llvm-profdata/memprof-yaml.test (+24)
  • (modified) llvm/tools/llvm-profdata/llvm-profdata.cpp (+14)
  • (modified) llvm/unittests/Analysis/MemoryProfileInfoTest.cpp (+1-70)
  • (modified) llvm/unittests/ProfileData/MemProfTest.cpp (+77)
diff --git a/llvm/include/llvm/Analysis/MemoryProfileInfo.h b/llvm/include/llvm/Analysis/MemoryProfileInfo.h
index bf1cfb1ee52bb..d0a9d0b169e99 100644
--- a/llvm/include/llvm/Analysis/MemoryProfileInfo.h
+++ b/llvm/include/llvm/Analysis/MemoryProfileInfo.h
@@ -21,11 +21,6 @@
 namespace llvm {
 namespace memprof {
 
-/// Return the allocation type for a given set of memory profile values.
-LLVM_ABI AllocationType getAllocType(uint64_t TotalLifetimeAccessDensity,
-                                     uint64_t AllocCount,
-                                     uint64_t TotalLifetime);
-
 /// Build callstack metadata from the provided list of call stack ids. Returns
 /// the resulting metadata node.
 LLVM_ABI MDNode *buildCallstackMetadata(ArrayRef<uint64_t> CallStack,
diff --git a/llvm/include/llvm/ProfileData/IndexedMemProfData.h b/llvm/include/llvm/ProfileData/IndexedMemProfData.h
index 2b40094a9bc21..304b1c4734af6 100644
--- a/llvm/include/llvm/ProfileData/IndexedMemProfData.h
+++ b/llvm/include/llvm/ProfileData/IndexedMemProfData.h
@@ -24,6 +24,7 @@
 
 namespace llvm {
 namespace memprof {
+class MemProfSummary;
 struct IndexedMemProfData {
   // A map to hold memprof data per function. The lower 64 bits obtained from
   // the md5 hash of the function name is used to index into the map.
@@ -89,7 +90,7 @@ struct IndexedMemProfData {
 Error writeMemProf(
     ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
     memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema,
-    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData);
-
+    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData,
+    memprof::MemProfSummary *MemProfSum);
 } // namespace llvm
 #endif
diff --git a/llvm/include/llvm/ProfileData/InstrProfReader.h b/llvm/include/llvm/ProfileData/InstrProfReader.h
index d104ab51430d1..99ea3c1808f5e 100644
--- a/llvm/include/llvm/ProfileData/InstrProfReader.h
+++ b/llvm/include/llvm/ProfileData/InstrProfReader.h
@@ -22,6 +22,7 @@
 #include "llvm/ProfileData/InstrProf.h"
 #include "llvm/ProfileData/InstrProfCorrelator.h"
 #include "llvm/ProfileData/MemProf.h"
+#include "llvm/ProfileData/MemProfSummary.h"
 #include "llvm/ProfileData/MemProfYAML.h"
 #include "llvm/Support/Endian.h"
 #include "llvm/Support/Error.h"
@@ -690,6 +691,8 @@ class IndexedMemProfReader {
   /// The MemProf version.
   memprof::IndexedVersion Version =
       static_cast<memprof::IndexedVersion>(memprof::MinimumSupportedVersion);
+  /// MemProf summary (if available, version >= 4).
+  std::unique_ptr<memprof::MemProfSummary> MemProfSum;
   /// MemProf profile schema (if available).
   memprof::MemProfSchema Schema;
   /// MemProf record profile data on-disk indexed via llvm::md5(FunctionName).
@@ -725,6 +728,8 @@ class IndexedMemProfReader {
 
   // Return the entire MemProf profile.
   memprof::AllMemProfData getAllMemProfData() const;
+
+  memprof::MemProfSummary *getSummary() const { return MemProfSum.get(); }
 };
 
 /// Reader for the indexed binary instrprof format.
@@ -887,6 +892,11 @@ class IndexedInstrProfReader : public InstrProfReader {
     }
   }
 
+  /// Return the MemProf summary. Will be null if unavailable (version < 4).
+  memprof::MemProfSummary *getMemProfSummary() const {
+    return MemProfReader.getSummary();
+  }
+
   Error readBinaryIds(std::vector<llvm::object::BuildID> &BinaryIds) override;
   Error printBinaryIds(raw_ostream &OS) override;
 };
diff --git a/llvm/include/llvm/ProfileData/InstrProfWriter.h b/llvm/include/llvm/ProfileData/InstrProfWriter.h
index cdb7afb623378..8bf1efffc7c8c 100644
--- a/llvm/include/llvm/ProfileData/InstrProfWriter.h
+++ b/llvm/include/llvm/ProfileData/InstrProfWriter.h
@@ -22,6 +22,7 @@
 #include "llvm/ProfileData/DataAccessProf.h"
 #include "llvm/ProfileData/IndexedMemProfData.h"
 #include "llvm/ProfileData/InstrProf.h"
+#include "llvm/ProfileData/MemProfSummaryBuilder.h"
 #include "llvm/Support/Error.h"
 #include <cstdint>
 #include <memory>
@@ -84,6 +85,10 @@ class InstrProfWriter {
 
   std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData;
 
+  // MemProf ummary builder to which records are added as MemProf data is added
+  // to the writer.
+  memprof::MemProfSummaryBuilder MemProfSumBuilder;
+
 public:
   // For memprof testing, random hotness can be assigned to the contexts if
   // MemprofGenerateRandomHotness is enabled. The random seed can be either
diff --git a/llvm/include/llvm/ProfileData/MemProfSummary.h b/llvm/include/llvm/ProfileData/MemProfSummary.h
new file mode 100644
index 0000000000000..c65c04dd1f85c
--- /dev/null
+++ b/llvm/include/llvm/ProfileData/MemProfSummary.h
@@ -0,0 +1,70 @@
+//===- MemProfSummary.h - MemProf summary support ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains MemProf summary support and related interfaces.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_PROFILEDATA_MEMPROFSUMMARY_H
+#define LLVM_PROFILEDATA_MEMPROFSUMMARY_H
+
+#include "llvm/IR/ModuleSummaryIndex.h"
+#include "llvm/ProfileData/InstrProf.h"
+#include "llvm/ProfileData/MemProf.h"
+
+namespace llvm {
+namespace memprof {
+
+/// Return the allocation type for a given set of memory profile values.
+AllocationType getAllocType(uint64_t TotalLifetimeAccessDensity,
+                            uint64_t AllocCount, uint64_t TotalLifetime);
+
+/// Helper to generate a single hash id for a given callstack, used for emitting
+/// matching statistics and useful for uniquing such statistics across modules.
+/// Also used to dedup contexts when computing the summary.
+uint64_t computeFullStackId(ArrayRef<Frame> CallStack);
+
+class MemProfSummary {
+private:
+  /// The number of summary fields below, which is used to enable some forwards
+  /// and backwards compatibility for the summary when serialized in the indexed
+  /// MemProf format. As long as no existing summary fields are removed or
+  /// reordered, and new summary fields are added after existing summary fields,
+  /// the MemProf indexed profile version does not need to be bumped to
+  /// accommodate new summary fields.
+  static const unsigned NumSummaryFields = 6;
+
+  const uint64_t NumContexts, NumColdContexts, NumHotContexts;
+  const uint64_t MaxColdTotalSize, MaxWarmTotalSize, MaxHotTotalSize;
+
+public:
+  MemProfSummary(uint64_t NumContexts, uint64_t NumColdContexts,
+                 uint64_t NumHotContexts, uint64_t MaxColdTotalSize,
+                 uint64_t MaxWarmTotalSize, uint64_t MaxHotTotalSize)
+      : NumContexts(NumContexts), NumColdContexts(NumColdContexts),
+        NumHotContexts(NumHotContexts), MaxColdTotalSize(MaxColdTotalSize),
+        MaxWarmTotalSize(MaxWarmTotalSize), MaxHotTotalSize(MaxHotTotalSize) {}
+
+  static unsigned getNumSummaryFields() { return NumSummaryFields; }
+  uint64_t getNumContexts() const { return NumContexts; }
+  uint64_t getNumColdContexts() const { return NumColdContexts; }
+  uint64_t getNumHotContexts() const { return NumHotContexts; }
+  uint64_t getMaxColdTotalSize() const { return MaxColdTotalSize; }
+  uint64_t getMaxWarmTotalSize() const { return MaxWarmTotalSize; }
+  uint64_t getMaxHotTotalSize() const { return MaxHotTotalSize; }
+  void printSummaryYaml(raw_ostream &OS) const;
+  /// Write to indexed MemProf profile.
+  void write(ProfOStream &OS) const;
+  /// Read from indexed MemProf profile.
+  static std::unique_ptr<MemProfSummary> deserialize(const unsigned char *&);
+};
+
+} // namespace memprof
+} // namespace llvm
+
+#endif // LLVM_PROFILEDATA_MEMPROFSUMMARY_H
diff --git a/llvm/include/llvm/ProfileData/MemProfSummaryBuilder.h b/llvm/include/llvm/ProfileData/MemProfSummaryBuilder.h
new file mode 100644
index 0000000000000..61cc46cfbc214
--- /dev/null
+++ b/llvm/include/llvm/ProfileData/MemProfSummaryBuilder.h
@@ -0,0 +1,47 @@
+//===- MemProfSummaryBuilder.h - MemProf summary building -------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains MemProf summary builder.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_PROFILEDATA_MEMPROFSUMMARYBUILDER_H
+#define LLVM_PROFILEDATA_MEMPROFSUMMARYBUILDER_H
+
+#include "llvm/ProfileData/MemProf.h"
+#include "llvm/ProfileData/MemProfSummary.h"
+
+namespace llvm {
+namespace memprof {
+
+class MemProfSummaryBuilder {
+private:
+  DenseSet<uint64_t> Contexts;
+  void addRecord(uint64_t, const PortableMemInfoBlock &);
+
+protected:
+  uint64_t MaxColdTotalSize = 0;
+  uint64_t MaxWarmTotalSize = 0;
+  uint64_t MaxHotTotalSize = 0;
+  uint64_t NumContexts = 0;
+  uint64_t NumColdContexts = 0;
+  uint64_t NumHotContexts = 0;
+
+public:
+  MemProfSummaryBuilder() = default;
+  ~MemProfSummaryBuilder() = default;
+
+  void addRecord(const IndexedMemProfRecord &);
+  void addRecord(const MemProfRecord &);
+  std::unique_ptr<MemProfSummary> getSummary();
+};
+
+} // namespace memprof
+} // namespace llvm
+
+#endif // LLVM_PROFILEDATA_MEMPROFSUMMARYBUILDER_H
diff --git a/llvm/lib/Analysis/MemoryProfileInfo.cpp b/llvm/lib/Analysis/MemoryProfileInfo.cpp
index 773d0b2f53e09..263241c3edf5a 100644
--- a/llvm/lib/Analysis/MemoryProfileInfo.cpp
+++ b/llvm/lib/Analysis/MemoryProfileInfo.cpp
@@ -21,34 +21,6 @@ using namespace llvm::memprof;
 
 #define DEBUG_TYPE "memory-profile-info"
 
-// Upper bound on lifetime access density (accesses per byte per lifetime sec)
-// for marking an allocation cold.
-LLVM_ABI cl::opt<float> MemProfLifetimeAccessDensityColdThreshold(
-    "memprof-lifetime-access-density-cold-threshold", cl::init(0.05),
-    cl::Hidden,
-    cl::desc("The threshold the lifetime access density (accesses per byte per "
-             "lifetime sec) must be under to consider an allocation cold"));
-
-// Lower bound on lifetime to mark an allocation cold (in addition to accesses
-// per byte per sec above). This is to avoid pessimizing short lived objects.
-LLVM_ABI cl::opt<unsigned> MemProfAveLifetimeColdThreshold(
-    "memprof-ave-lifetime-cold-threshold", cl::init(200), cl::Hidden,
-    cl::desc("The average lifetime (s) for an allocation to be considered "
-             "cold"));
-
-// Lower bound on average lifetime accesses density (total life time access
-// density / alloc count) for marking an allocation hot.
-LLVM_ABI cl::opt<unsigned> MemProfMinAveLifetimeAccessDensityHotThreshold(
-    "memprof-min-ave-lifetime-access-density-hot-threshold", cl::init(1000),
-    cl::Hidden,
-    cl::desc("The minimum TotalLifetimeAccessDensity / AllocCount for an "
-             "allocation to be considered hot"));
-
-LLVM_ABI cl::opt<bool>
-    MemProfUseHotHints("memprof-use-hot-hints", cl::init(false), cl::Hidden,
-                       cl::desc("Enable use of hot hints (only supported for "
-                                "unambigously hot allocations)"));
-
 cl::opt<bool> MemProfReportHintedSizes(
     "memprof-report-hinted-sizes", cl::init(false), cl::Hidden,
     cl::desc("Report total allocation sizes of hinted allocations"));
@@ -73,28 +45,6 @@ cl::opt<unsigned> MinCallsiteColdBytePercent(
     cl::desc("Min percent of cold bytes at a callsite to discard non-cold "
              "contexts"));
 
-AllocationType llvm::memprof::getAllocType(uint64_t TotalLifetimeAccessDensity,
-                                           uint64_t AllocCount,
-                                           uint64_t TotalLifetime) {
-  // The access densities are multiplied by 100 to hold 2 decimal places of
-  // precision, so need to divide by 100.
-  if (((float)TotalLifetimeAccessDensity) / AllocCount / 100 <
-          MemProfLifetimeAccessDensityColdThreshold
-      // Lifetime is expected to be in ms, so convert the threshold to ms.
-      && ((float)TotalLifetime) / AllocCount >=
-             MemProfAveLifetimeColdThreshold * 1000)
-    return AllocationType::Cold;
-
-  // The access densities are multiplied by 100 to hold 2 decimal places of
-  // precision, so need to divide by 100.
-  if (MemProfUseHotHints &&
-      ((float)TotalLifetimeAccessDensity) / AllocCount / 100 >
-          MemProfMinAveLifetimeAccessDensityHotThreshold)
-    return AllocationType::Hot;
-
-  return AllocationType::NotCold;
-}
-
 MDNode *llvm::memprof::buildCallstackMetadata(ArrayRef<uint64_t> CallStack,
                                               LLVMContext &Ctx) {
   SmallVector<Metadata *, 8> StackVals;
diff --git a/llvm/lib/ProfileData/CMakeLists.txt b/llvm/lib/ProfileData/CMakeLists.txt
index ca9ea3205ee1d..b1a0d8d707287 100644
--- a/llvm/lib/ProfileData/CMakeLists.txt
+++ b/llvm/lib/ProfileData/CMakeLists.txt
@@ -10,6 +10,8 @@ add_llvm_component_library(LLVMProfileData
   MemProf.cpp
   MemProfReader.cpp
   MemProfRadixTree.cpp
+  MemProfSummary.cpp
+  MemProfSummaryBuilder.cpp
   PGOCtxProfReader.cpp
   PGOCtxProfWriter.cpp
   ProfileSummaryBuilder.cpp
diff --git a/llvm/lib/ProfileData/IndexedMemProfData.cpp b/llvm/lib/ProfileData/IndexedMemProfData.cpp
index 7398e4c468bbe..25786cc6969d6 100644
--- a/llvm/lib/ProfileData/IndexedMemProfData.cpp
+++ b/llvm/lib/ProfileData/IndexedMemProfData.cpp
@@ -15,6 +15,7 @@
 #include "llvm/ProfileData/InstrProfReader.h"
 #include "llvm/ProfileData/MemProf.h"
 #include "llvm/ProfileData/MemProfRadixTree.h"
+#include "llvm/ProfileData/MemProfSummary.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/OnDiskHashTable.h"
 
@@ -220,7 +221,8 @@ static Error writeMemProfRadixTreeBased(
     ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
     memprof::IndexedVersion Version, bool MemProfFullSchema,
     std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData =
-        nullptr) {
+        nullptr,
+    memprof::MemProfSummary *MemProfSum = nullptr) {
   assert((Version == memprof::Version3 || Version == memprof::Version4) &&
          "Unsupported version for radix tree format");
 
@@ -232,6 +234,9 @@ static Error writeMemProfRadixTreeBased(
   if (Version >= memprof::Version4)
     OS.write(0ULL); // Reserve space for the data access profile offset.
 
+  if (Version == memprof::Version4)
+    MemProfSum->write(OS);
+
   auto Schema = memprof::getHotColdSchema();
   if (MemProfFullSchema)
     Schema = memprof::getFullSchema();
@@ -297,17 +302,19 @@ static Error writeMemProfV3(ProfOStream &OS,
 static Error writeMemProfV4(
     ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
     bool MemProfFullSchema,
-    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData) {
-  return writeMemProfRadixTreeBased(OS, MemProfData, memprof::Version4,
-                                    MemProfFullSchema,
-                                    std::move(DataAccessProfileData));
+    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData,
+    memprof::MemProfSummary *MemProfSum) {
+  return writeMemProfRadixTreeBased(
+      OS, MemProfData, memprof::Version4, MemProfFullSchema,
+      std::move(DataAccessProfileData), MemProfSum);
 }
 
 // Write out the MemProf data in a requested version.
 Error writeMemProf(
     ProfOStream &OS, memprof::IndexedMemProfData &MemProfData,
     memprof::IndexedVersion MemProfVersionRequested, bool MemProfFullSchema,
-    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData) {
+    std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData,
+    memprof::MemProfSummary *MemProfSum) {
   switch (MemProfVersionRequested) {
   case memprof::Version2:
     return writeMemProfV2(OS, MemProfData, MemProfFullSchema);
@@ -315,7 +322,7 @@ Error writeMemProf(
     return writeMemProfV3(OS, MemProfData, MemProfFullSchema);
   case memprof::Version4:
     return writeMemProfV4(OS, MemProfData, MemProfFullSchema,
-                          std::move(DataAccessProfileData));
+                          std::move(DataAccessProfileData), MemProfSum);
   }
 
   return make_error<InstrProfError>(
@@ -395,9 +402,11 @@ Error IndexedMemProfReader::deserializeRadixTreeBased(
       support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr);
 
   uint64_t DataAccessProfOffset = 0;
-  if (Version == memprof::Version4)
+  if (Version == memprof::Version4) {
     DataAccessProfOffset =
         support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr);
+    MemProfSum = memprof::MemProfSummary::deserialize(Ptr);
+  }
 
   // Read the schema.
   auto SchemaOr = memprof::readMemProfSchema(Ptr);
diff --git a/llvm/lib/ProfileData/InstrProfWriter.cpp b/llvm/lib/ProfileData/InstrProfWriter.cpp
index 039e1bc955cd4..7a4981124762c 100644
--- a/llvm/lib/ProfileData/InstrProfWriter.cpp
+++ b/llvm/lib/ProfileData/InstrProfWriter.cpp
@@ -240,6 +240,7 @@ void InstrProfWriter::addMemProfRecord(
       Alloc.Info.setTotalLifetime(NewTL);
     }
   }
+  MemProfSumBuilder.addRecord(NewRecord);
   auto [Iter, Inserted] = MemProfData.Records.insert({Id, NewRecord});
   // If we inserted a new record then we are done.
   if (Inserted) {
@@ -308,11 +309,16 @@ bool InstrProfWriter::addMemProfData(memprof::IndexedMemProfData Incoming,
         return false;
 
   // Add one record at a time if randomization is requested.
-  if (MemProfData.Records.empty() && !MemprofGenerateRandomHotness)
+  if (MemProfData.Records.empty() && !MemprofGenerateRandomHotness) {
+    // Need to manually add each record to the builder, which is otherwise done
+    // in addMemProfRecord.
+    for (const auto &[GUID, Record] : Incoming.Records)
+      MemProfSumBuilder.addRecord(Record);
     MemProfData.Records = std::move(Incoming.Records);
-  else
+  } else {
     for (const auto &[GUID, Record] : Incoming.Records)
       addMemProfRecord(GUID, Record);
+  }
 
   return true;
 }
@@ -612,10 +618,12 @@ Error InstrProfWriter::writeImpl(ProfOStream &OS) {
   if (static_cast<bool>(ProfileKind & InstrProfKind::MemProf)) {
     MemProfSectionStart = OS.tell();
 
-    if (auto E =
-            writeMemProf(OS, MemProfData, MemProfVersionRequested,
-                         MemProfFullSchema, std::move(DataAccessProfileData)))
+    // Get the finalized MemProf summary that was built when adding records.
+    auto MemProfSum = MemProfSumBuilder.getSummary();
 
+    if (auto E = writeMemProf(
+            OS, MemProfData, MemProfVersionRequested, MemProfFullSchema,
+            std::move(DataAccessProfileData), MemProfSum.get()))
       return E;
   }
 
diff --git a/llvm/lib/ProfileData/MemProfReader.cpp b/llvm/lib/ProfileData/MemProfReader.cpp
index 9c723e495e7f9..235b1347e0077 100644
--- a/llvm/lib/ProfileData/MemProfReader.cpp
+++ b/llvm/lib/ProfileData/MemProfReader.cpp
@@ -32,6 +32,7 @@
 #include "llvm/ProfileData/MemProf.h"
 #include "llvm/ProfileData/MemProfData.inc"
 #include "llvm/ProfileData/MemProfReader.h"
+#include "llvm/ProfileData/MemProfSummaryBuilder.h"
 #include "llvm/ProfileData/MemProfYAML.h"
 #include "llvm/ProfileData/SampleProf.h"
 #include "llvm/Support/Debug.h"
@@ -306,8 +307,10 @@ bool RawMemProfReader::hasFormat(const MemoryBuffer &Buffer) {
 }
 
 void RawMemProfReader::printYAML(raw_ostream &OS) {
+  MemProfSummaryBuilder MemProfSumBuilder;
   uint64_t NumAllocFunctions = 0, NumMibInfo = 0;
   for (const auto &KV : MemProfData.Records) {
+    MemProfSumBuilder.addRecord(KV.second);
     const size_t NumAllocSites = KV.second.AllocSites.size();
     if (NumAllocSites > 0) {
       NumAllocFunctions++;
@@ -315,6 +318,10 @@ void RawMemProfReader::printYAML(raw_ostream &OS) {
     }
   }
 
+  // Print the summary first, as it is printed as YAML comments.
+  auto MemProfSum = MemProfSumBuilder.getSummary();
+  MemProfSum->printSummaryYaml(OS);
+
   OS << "MemprofProfile:\n";
   OS << "  Summary:\n";
   OS << "    Version: " << MemprofRawVersion << "\n";
diff --git a/llvm/lib/ProfileData/MemProfSummary.cpp b/llvm/lib/ProfileData/MemProfSummary.cpp
new file mode 100644
index 0000000000000..ac1396bfcbfd0
--- /dev/null
+++ b/llvm/lib/ProfileData/MemProfSummary.cpp
@@ -0...
[truncated]

- fix comment typo
- combine version check in writer with earlier one for the data access
profile offset, which also corrects it to a >= v4 check
- fix the corresponding (pre-existing) check in the reader to also be >=
  v4 instead of == v4.
Copy link
Contributor

@kazutakahirata kazutakahirata left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo some minor comments. Thanks!


class MemProfSummaryBuilder {
private:
DenseSet<uint64_t> Contexts;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest some comment here? Something like:

// The set of full stack IDs that we've processed so far.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/// reordered, and new summary fields are added after existing summary fields,
/// the MemProf indexed profile version does not need to be bumped to
/// accommodate new summary fields.
static const unsigned NumSummaryFields = 6;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I suggest static constexpr unsigned? This is a compile-time constant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 3320 to 3321
auto *MemProfSum = Reader->getMemProfSummary();
if (MemProfSum) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Combine like so:

Suggested change
auto *MemProfSum = Reader->getMemProfSummary();
if (MemProfSum) {
if (auto *MemProfSum = Reader->getMemProfSummary()) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

NumHotContexts(NumHotContexts), MaxColdTotalSize(MaxColdTotalSize),
MaxWarmTotalSize(MaxWarmTotalSize), MaxHotTotalSize(MaxHotTotalSize) {}

static unsigned getNumSummaryFields() { return NumSummaryFields; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: static constexpr unsigned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 131 to 145
auto MemProfSum = std::make_unique<MemProfSummary>(
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr));
// Sanity check that the number of fields specified in summary was kept in
// sync with the fields being read and saved.
assert((Ptr - StartPos) / 8 == NumFieldsReadAndSaved);

// Enable forwards compatibility by reading and discarding any additional
// fields in the profile's summary.
while (NumSummaryFields-- > MemProfSummary::getNumSummaryFields())
(void)support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the order of argument evaluation is not guaranteed in C++.

One way to work around that is to use read instead of readNext:

support::endian::read<uint64_t, llvm::endianness::little>(Ptr),
support::endian::read<uint64_t, llvm::endianness::little>(Ptr + 8),
support::endian::read<uint64_t, llvm::endianness::little>(Ptr + 16),
...

Then you can replace the assert and the while loop with:

Ptr += NumSummaryFields * sizeof(uint64_t);

while removing StartPos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch on readNext possible issues. Simplified as suggested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, there is a failure with one of the tests on windows due to the fields being out of order, hopefully will be fixed by this change.

Copy link
Contributor Author

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the review!

/// reordered, and new summary fields are added after existing summary fields,
/// the MemProf indexed profile version does not need to be bumped to
/// accommodate new summary fields.
static const unsigned NumSummaryFields = 6;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

NumHotContexts(NumHotContexts), MaxColdTotalSize(MaxColdTotalSize),
MaxWarmTotalSize(MaxWarmTotalSize), MaxHotTotalSize(MaxHotTotalSize) {}

static unsigned getNumSummaryFields() { return NumSummaryFields; }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


class MemProfSummaryBuilder {
private:
DenseSet<uint64_t> Contexts;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 131 to 145
auto MemProfSum = std::make_unique<MemProfSummary>(
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr));
// Sanity check that the number of fields specified in summary was kept in
// sync with the fields being read and saved.
assert((Ptr - StartPos) / 8 == NumFieldsReadAndSaved);

// Enable forwards compatibility by reading and discarding any additional
// fields in the profile's summary.
while (NumSummaryFields-- > MemProfSummary::getNumSummaryFields())
(void)support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch on readNext possible issues. Simplified as suggested.

Comment on lines 3320 to 3321
auto *MemProfSum = Reader->getMemProfSummary();
if (MemProfSum) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 131 to 145
auto MemProfSum = std::make_unique<MemProfSummary>(
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr),
support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr));
// Sanity check that the number of fields specified in summary was kept in
// sync with the fields being read and saved.
assert((Ptr - StartPos) / 8 == NumFieldsReadAndSaved);

// Enable forwards compatibility by reading and discarding any additional
// fields in the profile's summary.
while (NumSummaryFields-- > MemProfSummary::getNumSummaryFields())
(void)support::endian::readNext<uint64_t, llvm::endianness::little>(Ptr);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, there is a failure with one of the tests on windows due to the fields being out of order, hopefully will be fixed by this change.

@teresajohnson teresajohnson merged commit cc6f446 into llvm:main May 28, 2025
7 of 10 checks passed
std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData);

std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData,
memprof::MemProfSummary *MemProfSum);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to keep the memprof summary around after we call writeMemProf, since it's a unique_ptr already can we just move it here instead of passing a pointer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

// instances of the same allocations.
DenseSet<uint64_t> Contexts;

void addRecord(uint64_t, const PortableMemInfoBlock &);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this one private and not the others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called by the other 2. I'll add a comment to clarify


void addRecord(uint64_t, const PortableMemInfoBlock &);

protected:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these protected? I don't see a usage where this is needed. Did I miss something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is just leftover from something else, I will remove.

writeMemProf(OS, MemProfData, MemProfVersionRequested,
MemProfFullSchema, std::move(DataAccessProfileData)))
// Get the finalized MemProf summary that was built when adding records.
auto MemProfSum = MemProfSumBuilder.getSummary();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline this into the call if you choose to change writeMemProf to take unique_ptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -35,6 +35,7 @@
#include "llvm/IR/Value.h"
#include "llvm/ProfileData/InstrProf.h"
#include "llvm/ProfileData/InstrProfReader.h"
#include "llvm/ProfileData/MemProfSummary.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little strange to have computeStackId in MemProfSummary.h (which I assume is the only reason we need to include this here. Maybe we can refactor these into a common library in the future (accounting for the module build deps).

Can you add a FIXME here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need getAllocType, which I had moved to MemProfSummary.cpp. I went ahead and created a new MemProfCommon.[h,cpp] and put computeStackId and getAllocType there, since that was straightforward and is cleaner overall.

#include "llvm/Support/raw_ostream.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"

#include <initializer_list>

using namespace llvm;

extern cl::opt<float> MemProfLifetimeAccessDensityColdThreshold;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move these extern decls inside the llvm namespace below instead of adding a broad using directive above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This turned out to not work, because the variables themselves are not defined in the llvm namespace. I was able to replace the using namespace with just adding "llvm::" qualifiers in front of the cl::opt types.

@llvm-ci
Copy link
Collaborator

llvm-ci commented May 29, 2025

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-fast running on sanitizer-buildbot4 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/169/builds/11823

Here is the relevant piece of the build log for the reference
Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 89936 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80..
FAIL: LLVM :: ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s (44986 of 89936)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
rm -rf /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp && mkdir -p /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp # RUN: at line 1
+ rm -rf /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp
+ mkdir -p /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_global_linker_private_def.s # RUN: at line 2
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_global_linker_private_def.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_internal_linker_private_def.s # RUN: at line 4
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_internal_linker_private_def.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s # RUN: at line 6
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o # RUN: at line 8
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o
not /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o # RUN: at line 9
+ not /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o
llvm-jitlink error: Symbols not found: [ l_foo ]
libc++abi: Pure virtual function called!
error: Aborted

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
465.15s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
386.34s: Clang :: Driver/fsanitize.c
333.98s: LLVM :: CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
316.09s: Clang :: Preprocessor/riscv-target-features.c
247.26s: LLVM :: CodeGen/AMDGPU/memintrinsic-unroll.ll
243.17s: Clang :: Driver/arm-cortex-cpus-2.c
229.99s: Clang :: OpenMP/target_update_codegen.cpp
224.45s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
221.16s: Clang :: Preprocessor/aarch64-target-features.c
221.10s: Clang :: Driver/arm-cortex-cpus-1.c
Step 10 (stage2/asan_ubsan check) failure: stage2/asan_ubsan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using lld-link: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:520: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 89936 tests, 88 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80..
FAIL: LLVM :: ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s (44986 of 89936)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
rm -rf /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp && mkdir -p /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp # RUN: at line 1
+ rm -rf /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp
+ mkdir -p /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_global_linker_private_def.s # RUN: at line 2
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_global_linker_private_def.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_internal_linker_private_def.s # RUN: at line 4
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/Inputs/MachO_internal_linker_private_def.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj    -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s # RUN: at line 6
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-mc -triple=x86_64-apple-macosx10.9 -filetype=obj -o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o /home/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/ExecutionEngine/JITLink/x86-64/MachO_linker_private_symbols.s
/home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o # RUN: at line 8
+ /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/global_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o
not /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o # RUN: at line 9
+ not /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/bin/llvm-jitlink -noexec /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/internal_lp_def.o /home/b/sanitizer-x86_64-linux-fast/build/llvm_build_asan_ubsan/test/ExecutionEngine/JITLink/x86-64/Output/MachO_linker_private_symbols.s.tmp/macho_lp_test.o
llvm-jitlink error: Symbols not found: [ l_foo ]
libc++abi: Pure virtual function called!
error: Aborted

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
Slowest Tests:
--------------------------------------------------------------------------
465.15s: LLVM :: CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
386.34s: Clang :: Driver/fsanitize.c
333.98s: LLVM :: CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
316.09s: Clang :: Preprocessor/riscv-target-features.c
247.26s: LLVM :: CodeGen/AMDGPU/memintrinsic-unroll.ll
243.17s: Clang :: Driver/arm-cortex-cpus-2.c
229.99s: Clang :: OpenMP/target_update_codegen.cpp
224.45s: Clang :: OpenMP/target_defaultmap_codegen_01.cpp
221.16s: Clang :: Preprocessor/aarch64-target-features.c
221.10s: Clang :: Driver/arm-cortex-cpus-1.c

Copy link
Contributor Author

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments, I'll send a PR tomorrow to address them

// instances of the same allocations.
DenseSet<uint64_t> Contexts;

void addRecord(uint64_t, const PortableMemInfoBlock &);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called by the other 2. I'll add a comment to clarify


void addRecord(uint64_t, const PortableMemInfoBlock &);

protected:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is just leftover from something else, I will remove.

#include "llvm/Support/raw_ostream.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"

#include <initializer_list>

using namespace llvm;

extern cl::opt<float> MemProfLifetimeAccessDensityColdThreshold;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData);

std::unique_ptr<memprof::DataAccessProfData> DataAccessProfileData,
memprof::MemProfSummary *MemProfSum);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

writeMemProf(OS, MemProfData, MemProfVersionRequested,
MemProfFullSchema, std::move(DataAccessProfileData)))
// Get the finalized MemProf summary that was built when adding records.
auto MemProfSum = MemProfSumBuilder.getSummary();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@@ -35,6 +35,7 @@
#include "llvm/IR/Value.h"
#include "llvm/ProfileData/InstrProf.h"
#include "llvm/ProfileData/InstrProfReader.h"
#include "llvm/ProfileData/MemProfSummary.h"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@teresajohnson
Copy link
Contributor Author

Thanks for the comments, I'll send a PR tomorrow to address them

#142003

google-yfyang pushed a commit to google-yfyang/llvm-project that referenced this pull request May 29, 2025
This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.

Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).

To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.

Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via `llvm-profdata show`. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).

This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
This patch adds support for a basic MemProf summary section, which is
built along with the indexed MemProf profile (e.g. when reading the raw
or YAML profiles), and serialized through the indexed profile just after
the header.

Currently only 6 fields are written, specifically the number of contexts
(total, cold, hot), and the max context size (cold, warm, hot).

To support forwards and backwards compatibility for added fields in the
indexed profile, the number of fields serialized first. The code is
written to support forwards compatibility (reading newer profiles with
additional summary fields), and comments indicate how to implement
backwards compatibility (reading older profiles with fewer summary
fields) as needed.

Support is added to print the summary as YAML comments when displaying
both the raw and indexed profiles via `llvm-profdata show`. Because they
are YAML comments, the YAML reader ignores these (the summary is always
recomputed when building the indexed profile as described above).

This necessitated moving some options and a couple of interfaces out of
Analysis/MemoryProfileInfo.cpp and into the new
ProfileData/MemProfSummary.cpp file, as we need to classify context
hotness earlier and also compute context ids to build the summary from
older indexed profiles.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants