[analyzer] Introduce per-entry-point statistics #131175

Merged: 15 commits, Mar 17, 2025

@necto (Contributor) commented Mar 13, 2025

So far, CSA has relied on the LLVM Statistic package, which lets us gather data about the analysis of an entire translation unit. However, a translation unit consists of a collection of loosely related entry points, and aggregating data across them is often counterproductive.

This change introduces a new lightweight, always-on facility that collects Boolean or numerical statistics for each entry point and dumps them in CSV format. This format makes it easy to aggregate data across multiple translation units and analyze it with common data-processing tools.
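To illustrate the kind of aggregation this enables, here is a minimal sketch (not part of the patch) that sums every column over several CSV dumps. It assumes the layout this change appears to produce: a header line starting with `EntryPoint`, then one row per entry point whose first field is a double-quoted name, with Booleans printed as `true`/`false`. The helper names `splitFields` and `accumulateCsv` are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split "a, b, c" on commas and trim the spaces around each field.
static std::vector<std::string> splitFields(const std::string &S) {
  std::vector<std::string> Out;
  std::istringstream In(S);
  std::string Cur;
  while (std::getline(In, Cur, ',')) {
    std::size_t B = Cur.find_first_not_of(' ');
    std::size_t E = Cur.find_last_not_of(' ');
    Out.push_back(B == std::string::npos ? std::string()
                                         : Cur.substr(B, E - B + 1));
  }
  return Out;
}

// Add every value column of one CSV document to Totals, keyed by column name.
// Booleans count as 1 (true) or 0 (false); malformed rows are skipped.
static void accumulateCsv(std::istream &In,
                          std::map<std::string, std::uint64_t> &Totals) {
  std::string Line;
  if (!std::getline(In, Line))
    return;
  std::vector<std::string> Header = splitFields(Line);
  // Assumption of this sketch: the first column is the entry-point name.
  assert(!Header.empty() && Header[0] == "EntryPoint");
  while (std::getline(In, Line)) {
    if (Line.empty() || Line[0] != '"')
      continue;
    // The quoted name may itself contain commas, e.g. "f(int, int)",
    // so start splitting only after the closing quote.
    std::size_t Close = Line.find('"', 1);
    if (Close == std::string::npos)
      continue;
    std::size_t Comma = Line.find(',', Close);
    if (Comma == std::string::npos)
      continue;
    std::vector<std::string> Vals = splitFields(Line.substr(Comma + 1));
    for (std::size_t I = 0; I < Vals.size() && I + 1 < Header.size(); ++I) {
      const std::string &V = Vals[I];
      std::uint64_t &Slot = Totals[Header[I + 1]];
      if (V == "true")
        Slot += 1;
      else if (!V.empty() && V.find_first_not_of("0123456789") == std::string::npos)
        Slot += std::stoull(V);
      // "false" and anything unrecognized contribute nothing.
    }
  }
}
```

In practice one would open each per-TU file with `std::ifstream` and call `accumulateCsv` on it with a shared `Totals` map; a spreadsheet or a dataframe library would do the same job with less code.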

We break down the existing statistics that were collected on the per-TU basis into values per entry point.

Additionally, we enable the statistics unconditionally (STATISTIC -> ALWAYS_ENABLED_STATISTIC) to facilitate their use (you can gather the data with a simple run-time flag rather than having to recompile the analyzer). These statistics are very light and add virtually no overhead.
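As a rough illustration of what "breaking down" a per-TU statistic means, the sketch below keeps two values behind one counter: a running total for the translation unit and a value that restarts for each entry point. It is a simplified stand-in with no LLVM dependencies; `DualCounter`, `Report`, and `simulate` are hypothetical names, not the classes added by this patch.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a statistic that is reported both ways.
class DualCounter {
  unsigned PerTU = 0;         // keeps growing for the whole translation unit
  unsigned PerEntryPoint = 0; // restarted for every entry point

public:
  DualCounter &operator++() {
    ++PerTU;
    ++PerEntryPoint;
    return *this;
  }
  unsigned total() const { return PerTU; }
  // Return the value for the entry point that just finished and start over.
  unsigned consume() { return std::exchange(PerEntryPoint, 0u); }
};

struct Report {
  std::vector<unsigned> PerEntryPoint; // one value per analyzed entry point
  unsigned Total = 0;                  // the old per-TU number
};

// Pretend to analyze several entry points; Bumps[i] is how many times the
// counter is incremented while entry point i is being analyzed.
inline Report simulate(const std::vector<unsigned> &Bumps) {
  DualCounter NumSteps;
  Report R;
  for (unsigned N : Bumps) {
    for (unsigned I = 0; I < N; ++I)
      ++NumSteps;
    R.PerEntryPoint.push_back(NumSteps.consume());
  }
  R.Total = NumSteps.total();
  // The per-entry-point values are a decomposition of the per-TU total.
  assert(std::accumulate(R.PerEntryPoint.begin(), R.PerEntryPoint.end(), 0u) ==
         R.Total);
  return R;
}
```

For example, `simulate({3, 0, 5})` yields per-entry-point values 3, 0, 5 and a total of 8, which shows how one large entry point can dominate a per-TU number.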


CPP-6160

@steakhal (Balázs Benics) started this design and I picked up the baton.

@llvmbot added the clang (Clang issues not falling into any other category) and clang:static analyzer labels Mar 13, 2025
@llvmbot (Member) commented Mar 13, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

Patch is 35.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131175.diff

17 Files Affected:

  • (modified) clang/docs/analyzer/developer-docs.rst (+1)
  • (added) clang/docs/analyzer/developer-docs/Statistics.rst (+21)
  • (modified) clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def (+6)
  • (added) clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h (+162)
  • (modified) clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp (+4-5)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+14-14)
  • (modified) clang/lib/StaticAnalyzer/Core/CMakeLists.txt (+1)
  • (modified) clang/lib/StaticAnalyzer/Core/CoreEngine.cpp (+7-9)
  • (added) clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp (+201)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+13-11)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp (+7-7)
  • (modified) clang/lib/StaticAnalyzer/Core/WorkList.cpp (+5-5)
  • (modified) clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp (+16-15)
  • (modified) clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp (+50-12)
  • (modified) clang/test/Analysis/analyzer-config.c (+1)
  • (added) clang/test/Analysis/csv2json.py (+98)
  • (modified) clang/test/lit.cfg.py (+10)
diff --git a/clang/docs/analyzer/developer-docs.rst b/clang/docs/analyzer/developer-docs.rst
index 60c0e71ad847c..a925cf7ca02e1 100644
--- a/clang/docs/analyzer/developer-docs.rst
+++ b/clang/docs/analyzer/developer-docs.rst
@@ -12,3 +12,4 @@ Contents:
    developer-docs/nullability
    developer-docs/RegionStore
    developer-docs/PerformanceInvestigation
+   developer-docs/Statistics
diff --git a/clang/docs/analyzer/developer-docs/Statistics.rst b/clang/docs/analyzer/developer-docs/Statistics.rst
new file mode 100644
index 0000000000000..d352bb6f01ebc
--- /dev/null
+++ b/clang/docs/analyzer/developer-docs/Statistics.rst
@@ -0,0 +1,21 @@
+======================
+Metrics and Statistics
+======================
+
+TODO: write this once the design is settled (@reviewer, don't look here yet)
+
+CSA enjoys two facilities to collect statistics per translation unit and per entry point.
+
+Mention the following tools:
+- STATISTIC macro
+- ALWAYS_ENABLED_STATISTIC macro
+
+- STAT_COUNTER macro
+- STAT_MAX macro
+
+- BoolEPStat
+- UnsignedEPStat
+- CounterEPStat
+- UnsignedMaxEPStat
+
+- dump-se-stats-to-csv="%t.csv"
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index 2aa00db411844..b88bce5e262a7 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -353,6 +353,12 @@ ANALYZER_OPTION(bool, DisplayCTUProgress, "display-ctu-progress",
                 "the analyzer's progress related to ctu.",
                 false)
 
+ANALYZER_OPTION(
+    StringRef, DumpSEStatsToCSV, "dump-se-stats-to-csv",
+    "If provided, the analyzer will dump statistics per entry point "
+    "into the specified CSV file.",
+    "")
+
 ANALYZER_OPTION(bool, ShouldTrackConditions, "track-conditions",
                 "Whether to track conditions that are a control dependency of "
                 "an already tracked variable.",
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
new file mode 100644
index 0000000000000..16c9fdf97fc30
--- /dev/null
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
@@ -0,0 +1,162 @@
+// EntryPointStats.h - Tracking statistics per  entry point -*- C++ -*-//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===---------------------------------------------------------------===//
+
+#ifndef CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H
+#define CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H
+
+#include "llvm/ADT/Statistic.h"
+#include "llvm/ADT/StringRef.h"
+
+namespace llvm {
+class raw_ostream;
+} // namespace llvm
+
+namespace clang {
+class Decl;
+
+namespace ento {
+
+class EntryPointStat {
+public:
+  llvm::StringLiteral name() const { return Name; }
+
+  static void lockRegistry();
+
+  static void takeSnapshot(const Decl *EntryPoint);
+  static void dumpStatsAsCSV(llvm::raw_ostream &OS);
+  static void dumpStatsAsCSV(llvm::StringRef FileName);
+
+protected:
+  explicit EntryPointStat(llvm::StringLiteral Name) : Name{Name} {}
+  EntryPointStat(const EntryPointStat &) = delete;
+  EntryPointStat(EntryPointStat &&) = delete;
+  EntryPointStat &operator=(EntryPointStat &) = delete;
+  EntryPointStat &operator=(EntryPointStat &&) = delete;
+
+private:
+  llvm::StringLiteral Name;
+};
+
+class BoolEPStat : public EntryPointStat {
+  std::optional<bool> Value = {};
+
+public:
+  explicit BoolEPStat(llvm::StringLiteral Name);
+  unsigned value() const { return Value && *Value; }
+  void set(bool V) {
+    assert(!Value.has_value());
+    Value = V;
+  }
+  void reset() { Value = {}; }
+};
+
+// used by CounterEntryPointTranslationUnitStat
+class CounterEPStat : public EntryPointStat {
+  using EntryPointStat::EntryPointStat;
+  unsigned Value = {};
+
+public:
+  explicit CounterEPStat(llvm::StringLiteral Name);
+  unsigned value() const { return Value; }
+  void reset() { Value = {}; }
+  CounterEPStat &operator++() {
+    ++Value;
+    return *this;
+  }
+
+  CounterEPStat &operator++(int) {
+    // No difference as you can't extract the value
+    return ++(*this);
+  }
+
+  CounterEPStat &operator+=(unsigned Inc) {
+    Value += Inc;
+    return *this;
+  }
+};
+
+// used by UnsignedMaxEtryPointTranslationUnitStatistic
+class UnsignedMaxEPStat : public EntryPointStat {
+  using EntryPointStat::EntryPointStat;
+  unsigned Value = {};
+
+public:
+  explicit UnsignedMaxEPStat(llvm::StringLiteral Name);
+  unsigned value() const { return Value; }
+  void reset() { Value = {}; }
+  void updateMax(unsigned X) { Value = std::max(Value, X); }
+};
+
+class UnsignedEPStat : public EntryPointStat {
+  using EntryPointStat::EntryPointStat;
+  std::optional<unsigned> Value = {};
+
+public:
+  explicit UnsignedEPStat(llvm::StringLiteral Name);
+  unsigned value() const { return Value.value_or(0); }
+  void reset() { Value.reset(); }
+  void set(unsigned V) {
+    assert(!Value.has_value());
+    Value = V;
+  }
+};
+
+class CounterEntryPointTranslationUnitStat {
+  CounterEPStat M;
+  llvm::TrackingStatistic S;
+
+public:
+  CounterEntryPointTranslationUnitStat(const char *DebugType,
+                                       llvm::StringLiteral Name,
+                                       llvm::StringLiteral Desc)
+      : M(Name), S(DebugType, Name.data(), Desc.data()) {}
+  CounterEntryPointTranslationUnitStat &operator++() {
+    ++M;
+    ++S;
+    return *this;
+  }
+
+  CounterEntryPointTranslationUnitStat &operator++(int) {
+    // No difference with prefix as the value is not observable.
+    return ++(*this);
+  }
+
+  CounterEntryPointTranslationUnitStat &operator+=(unsigned Inc) {
+    M += Inc;
+    S += Inc;
+    return *this;
+  }
+};
+
+class UnsignedMaxEtryPointTranslationUnitStatistic {
+  UnsignedMaxEPStat M;
+  llvm::TrackingStatistic S;
+
+public:
+  UnsignedMaxEtryPointTranslationUnitStatistic(const char *DebugType,
+                                               llvm::StringLiteral Name,
+                                               llvm::StringLiteral Desc)
+      : M(Name), S(DebugType, Name.data(), Desc.data()) {}
+  void updateMax(uint64_t Value) {
+    M.updateMax(static_cast<unsigned>(Value));
+    S.updateMax(Value);
+  }
+};
+
+#define STAT_COUNTER(VARNAME, DESC)                                            \
+  static clang::ento::CounterEntryPointTranslationUnitStat VARNAME = {         \
+      DEBUG_TYPE, #VARNAME, DESC}
+
+#define STAT_MAX(VARNAME, DESC)                                                \
+  static clang::ento::UnsignedMaxEtryPointTranslationUnitStatistic VARNAME = { \
+      DEBUG_TYPE, #VARNAME, DESC}
+
+} // namespace ento
+} // namespace clang
+
+#endif // CLANG_INCLUDE_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_ENTRYPOINTSTATS_H
diff --git a/clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp
index a54f1b1e71d47..d030e69a2a6e0 100644
--- a/clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/AnalyzerStatsChecker.cpp
@@ -13,12 +13,12 @@
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallString.h"
-#include "llvm/ADT/Statistic.h"
 #include "llvm/Support/raw_ostream.h"
 #include <optional>
 
@@ -27,10 +27,9 @@ using namespace ento;
 
 #define DEBUG_TYPE "StatsChecker"
 
-STATISTIC(NumBlocks,
-          "The # of blocks in top level functions");
-STATISTIC(NumBlocksUnreachable,
-          "The # of unreachable blocks in analyzing top level functions");
+STAT_COUNTER(NumBlocks, "The # of blocks in top level functions");
+STAT_COUNTER(NumBlocksUnreachable,
+             "The # of unreachable blocks in analyzing top level functions");
 
 namespace {
 class AnalyzerStatsChecker : public Checker<check::EndAnalysis> {
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index a4f9e092e8205..5f78fc433275d 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -39,6 +39,7 @@
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
 #include "clang/StaticAnalyzer/Core/CheckerRegistryData.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
@@ -54,7 +55,6 @@
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
-#include "llvm/ADT/Statistic.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/iterator_range.h"
@@ -82,19 +82,19 @@ using namespace llvm;
 
 #define DEBUG_TYPE "BugReporter"
 
-STATISTIC(MaxBugClassSize,
-          "The maximum number of bug reports in the same equivalence class");
-STATISTIC(MaxValidBugClassSize,
-          "The maximum number of bug reports in the same equivalence class "
-          "where at least one report is valid (not suppressed)");
-
-STATISTIC(NumTimesReportPassesZ3, "Number of reports passed Z3");
-STATISTIC(NumTimesReportRefuted, "Number of reports refuted by Z3");
-STATISTIC(NumTimesReportEQClassAborted,
-          "Number of times a report equivalence class was aborted by the Z3 "
-          "oracle heuristic");
-STATISTIC(NumTimesReportEQClassWasExhausted,
-          "Number of times all reports of an equivalence class was refuted");
+STAT_MAX(MaxBugClassSize,
+         "The maximum number of bug reports in the same equivalence class");
+STAT_MAX(MaxValidBugClassSize,
+         "The maximum number of bug reports in the same equivalence class "
+         "where at least one report is valid (not suppressed)");
+
+STAT_COUNTER(NumTimesReportPassesZ3, "Number of reports passed Z3");
+STAT_COUNTER(NumTimesReportRefuted, "Number of reports refuted by Z3");
+STAT_COUNTER(NumTimesReportEQClassAborted,
+             "Number of times a report equivalence class was aborted by the Z3 "
+             "oracle heuristic");
+STAT_COUNTER(NumTimesReportEQClassWasExhausted,
+             "Number of times all reports of an equivalence class was refuted");
 
 BugReporterVisitor::~BugReporterVisitor() = default;
 
diff --git a/clang/lib/StaticAnalyzer/Core/CMakeLists.txt b/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
index fb9394a519eb7..d0a9b202f9c52 100644
--- a/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Core/CMakeLists.txt
@@ -24,6 +24,7 @@ add_clang_library(clangStaticAnalyzerCore
   CoreEngine.cpp
   DynamicExtent.cpp
   DynamicType.cpp
+  EntryPointStats.cpp
   Environment.cpp
   ExplodedGraph.cpp
   ExprEngine.cpp
diff --git a/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp b/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
index d96211c3a6635..5c05c9c87f124 100644
--- a/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CoreEngine.cpp
@@ -22,12 +22,12 @@
 #include "clang/Basic/LLVM.h"
 #include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/BlockCounter.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/FunctionSummary.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/WorkList.h"
 #include "llvm/ADT/STLExtras.h"
-#include "llvm/ADT/Statistic.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/FormatVariadic.h"
@@ -43,14 +43,12 @@ using namespace ento;
 
 #define DEBUG_TYPE "CoreEngine"
 
-STATISTIC(NumSteps,
-            "The # of steps executed.");
-STATISTIC(NumSTUSteps, "The # of STU steps executed.");
-STATISTIC(NumCTUSteps, "The # of CTU steps executed.");
-STATISTIC(NumReachedMaxSteps,
-            "The # of times we reached the max number of steps.");
-STATISTIC(NumPathsExplored,
-            "The # of paths explored by the analyzer.");
+STAT_COUNTER(NumSteps, "The # of steps executed.");
+STAT_COUNTER(NumSTUSteps, "The # of STU steps executed.");
+STAT_COUNTER(NumCTUSteps, "The # of CTU steps executed.");
+ALWAYS_ENABLED_STATISTIC(NumReachedMaxSteps,
+                         "The # of times we reached the max number of steps.");
+STAT_COUNTER(NumPathsExplored, "The # of paths explored by the analyzer.");
 
 //===----------------------------------------------------------------------===//
 // Core analysis engine.
diff --git a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
new file mode 100644
index 0000000000000..f17d0522f983a
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
@@ -0,0 +1,201 @@
+//===- EntryPointStats.cpp ----------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--------------------------------------------------------------------===//
+
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
+#include "clang/AST/DeclBase.h"
+#include "clang/Analysis/AnalysisDeclContext.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringRef.h"
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/ManagedStatic.h"
+#include "llvm/Support/raw_ostream.h"
+#include <iterator>
+
+using namespace clang;
+using namespace ento;
+
+namespace {
+struct Registry {
+  std::vector<BoolEPStat *> BoolStats;
+  std::vector<CounterEPStat *> CounterStats;
+  std::vector<UnsignedMaxEPStat *> UnsignedMaxStats;
+  std::vector<UnsignedEPStat *> UnsignedStats;
+
+  bool IsLocked = false;
+
+  struct Snapshot {
+    const Decl *EntryPoint;
+    std::vector<bool> BoolStatValues;
+    std::vector<unsigned> UnsignedStatValues;
+
+    void dumpDynamicStatsAsCSV(llvm::raw_ostream &OS) const;
+  };
+
+  std::vector<Snapshot> Snapshots;
+};
+} // namespace
+
+static llvm::ManagedStatic<Registry> StatsRegistry;
+
+namespace {
+template <typename Callback> void enumerateStatVectors(const Callback &Fn) {
+  Fn(StatsRegistry->BoolStats);
+  Fn(StatsRegistry->CounterStats);
+  Fn(StatsRegistry->UnsignedMaxStats);
+  Fn(StatsRegistry->UnsignedStats);
+}
+} // namespace
+
+static void checkStatName(const EntryPointStat *M) {
+#ifdef NDEBUG
+  return;
+#endif // NDEBUG
+  constexpr std::array AllowedSpecialChars = {
+      '+', '-', '_', '=', ':', '(',  ')', '@', '!', '~',
+      '$', '%', '^', '&', '*', '\'', ';', '<', '>', '/'};
+  for (unsigned char C : M->name()) {
+    if (!std::isalnum(C) && !llvm::is_contained(AllowedSpecialChars, C)) {
+      llvm::errs() << "Stat name \"" << M->name() << "\" contains character '"
+                   << C << "' (" << static_cast<int>(C)
+                   << ") that is not allowed.";
+      assert(false && "The Stat name contains unallowed character");
+    }
+  }
+}
+
+void EntryPointStat::lockRegistry() {
+  auto CmpByNames = [](const EntryPointStat *L, const EntryPointStat *R) {
+    return L->name() < R->name();
+  };
+  enumerateStatVectors(
+      [CmpByNames](auto &Stats) { llvm::sort(Stats, CmpByNames); });
+  enumerateStatVectors(
+      [](const auto &Stats) { llvm::for_each(Stats, checkStatName); });
+  StatsRegistry->IsLocked = true;
+}
+
+static bool isRegistered(llvm::StringLiteral Name) {
+  auto ByName = [Name](const EntryPointStat *M) { return M->name() == Name; };
+  bool Result = false;
+  enumerateStatVectors([ByName, &Result](const auto &Stats) {
+    Result = Result || llvm::any_of(Stats, ByName);
+  });
+  return Result;
+}
+
+BoolEPStat::BoolEPStat(llvm::StringLiteral Name) : EntryPointStat(Name) {
+  assert(!StatsRegistry->IsLocked);
+  assert(!isRegistered(Name));
+  StatsRegistry->BoolStats.push_back(this);
+}
+
+CounterEPStat::CounterEPStat(llvm::StringLiteral Name) : EntryPointStat(Name) {
+  assert(!StatsRegistry->IsLocked);
+  assert(!isRegistered(Name));
+  StatsRegistry->CounterStats.push_back(this);
+}
+
+UnsignedMaxEPStat::UnsignedMaxEPStat(llvm::StringLiteral Name)
+    : EntryPointStat(Name) {
+  assert(!StatsRegistry->IsLocked);
+  assert(!isRegistered(Name));
+  StatsRegistry->UnsignedMaxStats.push_back(this);
+}
+
+UnsignedEPStat::UnsignedEPStat(llvm::StringLiteral Name)
+    : EntryPointStat(Name) {
+  assert(!StatsRegistry->IsLocked);
+  assert(!isRegistered(Name));
+  StatsRegistry->UnsignedStats.push_back(this);
+}
+
+static std::vector<unsigned> consumeUnsignedStats() {
+  std::vector<unsigned> Result;
+  Result.reserve(StatsRegistry->CounterStats.size() +
+                 StatsRegistry->UnsignedMaxStats.size() +
+                 StatsRegistry->UnsignedStats.size());
+  for (auto *M : StatsRegistry->CounterStats) {
+    Result.push_back(M->value());
+    M->reset();
+  }
+  for (auto *M : StatsRegistry->UnsignedMaxStats) {
+    Result.push_back(M->value());
+    M->reset();
+  }
+  for (auto *M : StatsRegistry->UnsignedStats) {
+    Result.push_back(M->value());
+    M->reset();
+  }
+  return Result;
+}
+
+static std::vector<llvm::StringLiteral> getStatNames() {
+  std::vector<llvm::StringLiteral> Ret;
+  auto GetName = [](const EntryPointStat *M) { return M->name(); };
+  enumerateStatVectors([GetName, &Ret](const auto &Stats) {
+    transform(Stats, std::back_inserter(Ret), GetName);
+  });
+  return Ret;
+}
+
+void Registry::Snapshot::dumpDynamicStatsAsCSV(llvm::raw_ostream &OS) const {
+  OS << '"';
+  llvm::printEscapedString(
+      clang::AnalysisDeclContext::getFunctionName(EntryPoint), OS);
+  OS << "\", ";
+  auto PrintAsBool = [&OS](bool B) { OS << (B ? "true" : "false"); };
+  llvm::interleaveComma(BoolStatValues, OS, PrintAsBool);
+  OS << ((BoolStatValues.empty() || UnsignedStatValues.empty()) ? "" : ", ");
+  llvm::interleaveComma(UnsignedStatValues, OS);
+}
+
+static std::vector<bool> consumeBoolStats() {
+  std::vector<bool> Result;
+  Result.reserve(StatsRegistry->BoolStats.size());
+  for (auto *M : StatsRegistry->BoolStats) {
+    Result.push_back(M->value());
+    M->reset();
+  }
+  return Result;
+}
+
+void EntryPointStat::takeSnapshot(const Decl *EntryPoint) {
+  auto BoolValues = consumeBoolStats();
+  auto UnsignedValues = consumeUnsignedStats();
+  StatsRegistry->Snapshots.push_back(
+      {EntryPoint, std::move(BoolValues), std::move(UnsignedValues)});
+}
+
+void EntryPointStat::dumpStatsAsCSV(llvm::StringRef FileName) {
+  std::error_code EC;
+  llvm::raw_fd_ostream File(FileName, EC, llvm::sys::fs::OF_Text);
+  if (EC)
+    return;
+  dumpStatsAsCSV(File);
+}
+
+void EntryPointStat::dumpStatsAsCSV(llvm::raw_ostream &OS) {
+  OS << "EntryPoint, ";
+  llvm::interleaveComma(getStatNames(), OS);
+  OS << "\n";
+
+  std::vector<std::string> Rows;
+  Rows.reserve(StatsRegistry->Snapshots.size());
+  for (const auto &Snapshot : StatsRegistry->Snapshots) {
+    std::string Row;
+    llvm::raw_string_ostream RowOs(Row);
+    Snapshot.dumpDynamicStatsAsCSV(RowOs);
+    RowOs << "\n";
+    Rows.push_back(RowOs.str());
+  }
+  llvm::sort(Rows);
+  for (const auto &Row : Rows) {
+    OS << Row;
+  }
+}
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index 914eb0f4ef6bd..12a5b248c843f 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -49,6 +49,7 @@
 #include "clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h"
 #include "clang/StaticAnalyz...
[truncated]
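
For readers skimming the truncated patch, the register / lock / snapshot / dump flow visible in EntryPointStats.cpp can be approximated in a few lines of standard C++. This is only a sketch under simplifying assumptions: it handles counters only, does not escape entry-point names, and uses the hypothetical names `MiniCounter` and `MiniRegistry` rather than the real classes.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of one per-entry-point counter.
struct MiniCounter {
  std::string Name;
  unsigned Value = 0;
  explicit MiniCounter(std::string N) : Name(std::move(N)) {}
};

// Hypothetical miniature of the registry: counters only, no LLVM types.
class MiniRegistry {
  std::vector<MiniCounter *> Counters;
  std::vector<std::pair<std::string, std::vector<unsigned>>> Snapshots;
  bool Locked = false;

public:
  void add(MiniCounter &C) {
    assert(!Locked && "register every counter before locking");
    Counters.push_back(&C);
  }

  // Fix the column order (sorted by name) before the first snapshot.
  void lock() {
    std::sort(Counters.begin(), Counters.end(),
              [](const MiniCounter *L, const MiniCounter *R) {
                return L->Name < R->Name;
              });
    Locked = true;
  }

  // Record the current values under the entry point's name, then reset them.
  void takeSnapshot(const std::string &EntryPoint) {
    assert(Locked && "lock the registry before taking snapshots");
    std::vector<unsigned> Row;
    Row.reserve(Counters.size());
    for (MiniCounter *C : Counters) {
      Row.push_back(C->Value);
      C->Value = 0;
    }
    Snapshots.emplace_back(EntryPoint, std::move(Row));
  }

  // Header line first, then one row per snapshot, with rows sorted as text.
  std::string toCSV() const {
    std::ostringstream OS;
    OS << "EntryPoint";
    for (const MiniCounter *C : Counters)
      OS << ", " << C->Name;
    OS << "\n";
    std::vector<std::string> Rows;
    for (const auto &S : Snapshots) {
      std::ostringstream R;
      R << '"' << S.first << '"';
      for (unsigned V : S.second)
        R << ", " << V;
      R << "\n";
      Rows.push_back(R.str());
    }
    std::sort(Rows.begin(), Rows.end());
    for (const std::string &R : Rows)
      OS << R;
    return OS.str();
  }
};
```

Sorting the columns at lock time and the rows at dump time makes the output independent of registration and analysis order, which is presumably why the patch sorts in the same two places.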

github-actions bot commented Mar 13, 2025

✅ With the latest revision this PR passed the Python code formatter.

@balazs-benics-sonarsource

Moving the attribution out here, so that we do not get spammed by different forks when they pick this commit or rebase.

@balazs-benics-sonarsource started this design and I picked up the baton.

@NagyDonat (Contributor) left a comment

Thanks for contributing infrastructural improvements like this; they will be helpful in analyzer development. Overall I'm satisfied with the idea of this commit, but I marked a few typos and stylistic suggestions in inline comments.

@necto requested a review from NagyDonat on March 14, 2025 14:23
@steakhal (Contributor) left a comment

I only found a couple of minor points. Otherwise looks good.

@necto requested a review from steakhal on March 15, 2025 05:53
@steakhal (Contributor) left a comment

LGTM

@steakhal merged commit 57e3641 into llvm:main on Mar 17, 2025
10 of 12 checks passed