Skip to content

Reapply "[analyzer][NFC] Make RegionStore dumps deterministic" #115884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 12, 2024
Merged

Reapply "[analyzer][NFC] Make RegionStore dumps deterministic" #115884

merged 3 commits into from
Nov 12, 2024

Conversation

steakhal
Copy link
Contributor

This is reapplies #115615 without using tuples. The eager call of getRegion() and getOffset() could cause crashes when the Store had symbolic bindings.

Here I'm fixing the crash by lazily calling those getters.

Also, the tuple version poorly sorted the Clusters. The memory spaces should have come before the regular clusters.
Now, that is also fixed here, demonstrated by the test.

Dump the memory space clusters before the other clusters, in
alphabetical order. Then default bindings over direct bindings, and if
any has symbolic offset, then those should come before the ones with
concrete offsets.
In theory, we should either have a symbolic offset OR concrete offsets,
but never both at the same time.
This test also demonstrates that all sorts of bindings can be printed in
the right order.
Direct, Default, and Symbolic bindings.
Also testing the order of the Clusters. Memory spaces come first in
alphabetical order then the regular clusters.
@steakhal steakhal requested a review from NagyDonat November 12, 2024 15:42
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Nov 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 12, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balazs Benics (steakhal)

Changes

This is reapplies #115615 without using tuples. The eager call of getRegion() and getOffset() could cause crashes when the Store had symbolic bindings.

Here I'm fixing the crash by lazily calling those getters.

Also, the tuple version poorly sorted the Clusters. The memory spaces should have come before the regular clusters.
Now, that is also fixed here, demonstrated by the test.


Full diff: https://github.com/llvm/llvm-project/pull/115884.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (+73-13)
  • (added) clang/test/Analysis/store-dump-orders.cpp (+57)
diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index 674099dd7e1f0f..6bad9a93a30169 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -67,9 +67,10 @@ class BindingKey {
             isa<ObjCIvarRegion, CXXDerivedObjectRegion>(r)) &&
            "Not a base");
   }
-public:
 
+public:
   bool isDirect() const { return P.getInt() & Direct; }
+  bool isDefault() const { return !isDirect(); }
   bool hasSymbolicOffset() const { return P.getInt() & Symbolic; }
 
   const MemRegion *getRegion() const { return P.getPointer(); }
@@ -232,27 +233,86 @@ class RegionBindingsRef : public llvm::ImmutableMapRef<const MemRegion *,
 
   void printJson(raw_ostream &Out, const char *NL = "\n",
                  unsigned int Space = 0, bool IsDot = false) const {
-    for (iterator I = begin(), E = end(); I != E; ++I) {
-      // TODO: We might need a .printJson for I.getKey() as well.
+    using namespace llvm;
+    DenseMap<const MemRegion *, std::string> StringifyCache;
+    auto ToString = [&StringifyCache](const MemRegion *R) {
+      auto [Place, Inserted] = StringifyCache.try_emplace(R);
+      if (!Inserted)
+        return Place->second;
+      std::string Res;
+      raw_string_ostream OS(Res);
+      OS << R;
+      Place->second = Res;
+      return Res;
+    };
+
+    using Cluster =
+        std::pair<const MemRegion *, ImmutableMap<BindingKey, SVal>>;
+    using Binding = std::pair<BindingKey, SVal>;
+
+    const auto MemSpaceBeforeRegionName = [&ToString](const Cluster *L,
+                                                      const Cluster *R) {
+      if (isa<MemSpaceRegion>(L->first) && !isa<MemSpaceRegion>(R->first))
+        return true;
+      if (!isa<MemSpaceRegion>(L->first) && isa<MemSpaceRegion>(R->first))
+        return false;
+      return ToString(L->first) < ToString(R->first);
+    };
+
+    const auto SymbolicBeforeOffset = [&ToString](const BindingKey &L,
+                                                  const BindingKey &R) {
+      if (L.hasSymbolicOffset() && !R.hasSymbolicOffset())
+        return true;
+      if (!L.hasSymbolicOffset() && R.hasSymbolicOffset())
+        return false;
+      if (L.hasSymbolicOffset() && R.hasSymbolicOffset())
+        return ToString(L.getRegion()) < ToString(R.getRegion());
+      return L.getOffset() < R.getOffset();
+    };
+
+    const auto DefaultBindingBeforeDirectBindings =
+        [&SymbolicBeforeOffset](const Binding *LPtr, const Binding *RPtr) {
+          const BindingKey &L = LPtr->first;
+          const BindingKey &R = RPtr->first;
+          if (L.isDefault() && !R.isDefault())
+            return true;
+          if (!L.isDefault() && R.isDefault())
+            return false;
+          assert(L.isDefault() == R.isDefault());
+          return SymbolicBeforeOffset(L, R);
+        };
+
+    const auto AddrOf = [](const auto &Item) { return &Item; };
+
+    std::vector<const Cluster *> SortedClusters;
+    SortedClusters.reserve(std::distance(begin(), end()));
+    append_range(SortedClusters, map_range(*this, AddrOf));
+    llvm::sort(SortedClusters, MemSpaceBeforeRegionName);
+
+    for (auto [Idx, C] : llvm::enumerate(SortedClusters)) {
+      const auto &[BaseRegion, Bindings] = *C;
       Indent(Out, Space, IsDot)
-          << "{ \"cluster\": \"" << I.getKey() << "\", \"pointer\": \""
-          << (const void *)I.getKey() << "\", \"items\": [" << NL;
+          << "{ \"cluster\": \"" << BaseRegion << "\", \"pointer\": \""
+          << (const void *)BaseRegion << "\", \"items\": [" << NL;
+
+      std::vector<const Binding *> SortedBindings;
+      SortedBindings.reserve(std::distance(Bindings.begin(), Bindings.end()));
+      append_range(SortedBindings, map_range(Bindings, AddrOf));
+      llvm::sort(SortedBindings, DefaultBindingBeforeDirectBindings);
 
       ++Space;
-      const ClusterBindings &CB = I.getData();
-      for (ClusterBindings::iterator CI = CB.begin(), CE = CB.end(); CI != CE;
-           ++CI) {
-        Indent(Out, Space, IsDot) << "{ " << CI.getKey() << ", \"value\": ";
-        CI.getData().printJson(Out, /*AddQuotes=*/true);
+      for (auto [Idx, B] : llvm::enumerate(SortedBindings)) {
+        const auto &[Key, Value] = *B;
+        Indent(Out, Space, IsDot) << "{ " << Key << ", \"value\": ";
+        Value.printJson(Out, /*AddQuotes=*/true);
         Out << " }";
-        if (std::next(CI) != CE)
+        if (Idx != SortedBindings.size() - 1)
           Out << ',';
         Out << NL;
       }
-
       --Space;
       Indent(Out, Space, IsDot) << "]}";
-      if (std::next(I) != E)
+      if (Idx != SortedClusters.size() - 1)
         Out << ',';
       Out << NL;
     }
diff --git a/clang/test/Analysis/store-dump-orders.cpp b/clang/test/Analysis/store-dump-orders.cpp
new file mode 100644
index 00000000000000..ea81fde0935467
--- /dev/null
+++ b/clang/test/Analysis/store-dump-orders.cpp
@@ -0,0 +1,57 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.ExprInspection -verify %s 2>&1 | FileCheck %s
+// expected-no-diagnostics
+
+void clang_analyzer_printState();
+
+struct Member {
+  int large[10];
+};
+Member getMember();
+
+struct Class {
+  Member m;
+  int first;
+  int second;
+  int third;
+};
+
+
+void test_output(int n) {
+  Class objsecond;
+  objsecond.m.large[n] = 20;
+
+  Class objfirst;
+
+  objfirst.m = getMember();
+  objfirst.second = 2;
+  objfirst.third = 3;
+  objfirst.first = 1;
+
+  clang_analyzer_printState();
+  // Default binding is before any direct bindings.
+  // Direct bindings are increasing by offset.
+  // Global memory space clusters come before any other clusters.
+  // Otherwise, Clusters are in alphabetical order.
+
+  // CHECK:       "store": { "pointer": "0x{{[0-9a-f]+}}", "items": [
+  // CHECK-NEXT:    { "cluster": "GlobalInternalSpaceRegion", "pointer": "0x{{[0-9a-f]+}}", "items": [
+  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "conj_$
+  // CHECK-NEXT:    ]},
+  // CHECK-NEXT:    { "cluster": "GlobalSystemSpaceRegion", "pointer": "0x{{[0-9a-f]+}}", "items": [
+  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "conj_$
+  // CHECK-NEXT:    ]},
+  // CHECK-NEXT:    { "cluster": "objfirst", "pointer": "0x{{[0-9a-f]+}}", "items": [
+  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "lazyCompoundVal
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 320, "value": "1 S32b" },
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 352, "value": "2 S32b" },
+  // CHECK-NEXT:      { "kind": "Direct", "offset": 384, "value": "2 S32b" }
+  // CHECK-NEXT:    ]},
+  // CHECK-NEXT:    { "cluster": "objsecond", "pointer": "0x{{[0-9a-f]+}}", "items": [
+  // CHECK-NEXT:      { "kind": "Default", "offset": 0, "value": "0 S32b" },
+  // CHECK-NEXT:      { "kind": "Direct", "offset": null, "value": "20 S32b" }
+  // CHECK-NEXT:    ]}
+  // CHECK-NEXT:  ]},
+
+  (void)objfirst;
+  (void)objsecond;
+}

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, and sorry for delaying this commit with the tuple-based suggestion!

@steakhal steakhal merged commit ae7392b into llvm:main Nov 12, 2024
8 checks passed
@steakhal steakhal deleted the bb/region-store-deterministic-dumps branch November 12, 2024 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants