Skip to content

Remove value cache in SCEV comparator. #100721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 30, 2024
Merged

Remove value cache in SCEV comparator. #100721

merged 1 commit into from
Jul 30, 2024

Conversation

jreiffers
Copy link
Member

The cache triggers almost never, and seems unlikely to help with performance. However, when it does, it is likely to cause the comparator to become inconsistent due to a bad interaction of the depth limit and cache hits. This leads to crashes in debug builds. See the new unit test for a reproducer.

@jreiffers jreiffers requested a review from d0k July 26, 2024 09:17
@jreiffers jreiffers requested a review from nikic as a code owner July 26, 2024 09:17
@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 26, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Johannes Reifferscheid (jreiffers)

Changes

The cache triggers almost never, and seems unlikely to help with performance. However, when it does, it is likely to cause the comparator to become inconsistent due to a bad interaction of the depth limit and cache hits. This leads to crashes in debug builds. See the new unit test for a reproducer.


Full diff: https://github.com/llvm/llvm-project/pull/100721.diff

2 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+9-15)
  • (modified) llvm/unittests/Analysis/ScalarEvolutionTest.cpp (+36)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 51cffac808768..a69fe43b848b8 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -598,10 +598,9 @@ void SCEVUnknown::allUsesReplacedWith(Value *New) {
 /// Since we do not continue running this routine on expression trees once we
 /// have seen unequal values, there is no need to track them in the cache.
 static int
-CompareValueComplexity(EquivalenceClasses<const Value *> &EqCacheValue,
-                       const LoopInfo *const LI, Value *LV, Value *RV,
+CompareValueComplexity(const LoopInfo *const LI, Value *LV, Value *RV,
                        unsigned Depth) {
-  if (Depth > MaxValueCompareDepth || EqCacheValue.isEquivalent(LV, RV))
+  if (Depth > MaxValueCompareDepth)
     return 0;
 
   // Order pointer values after integer values. This helps SCEVExpander form
@@ -660,15 +659,13 @@ CompareValueComplexity(EquivalenceClasses<const Value *> &EqCacheValue,
       return (int)LNumOps - (int)RNumOps;
 
     for (unsigned Idx : seq(LNumOps)) {
-      int Result =
-          CompareValueComplexity(EqCacheValue, LI, LInst->getOperand(Idx),
-                                 RInst->getOperand(Idx), Depth + 1);
+      int Result = CompareValueComplexity(LI, LInst->getOperand(Idx),
+                                          RInst->getOperand(Idx), Depth + 1);
       if (Result != 0)
         return Result;
     }
   }
 
-  EqCacheValue.unionSets(LV, RV);
   return 0;
 }
 
@@ -679,7 +676,6 @@ CompareValueComplexity(EquivalenceClasses<const Value *> &EqCacheValue,
 // not know if they are equivalent for sure.
 static std::optional<int>
 CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
-                      EquivalenceClasses<const Value *> &EqCacheValue,
                       const LoopInfo *const LI, const SCEV *LHS,
                       const SCEV *RHS, DominatorTree &DT, unsigned Depth = 0) {
   // Fast-path: SCEVs are uniqued so we can do a quick equality check.
@@ -705,8 +701,8 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
     const SCEVUnknown *LU = cast<SCEVUnknown>(LHS);
     const SCEVUnknown *RU = cast<SCEVUnknown>(RHS);
 
-    int X = CompareValueComplexity(EqCacheValue, LI, LU->getValue(),
-                                   RU->getValue(), Depth + 1);
+    int X = CompareValueComplexity(LI, LU->getValue(), RU->getValue(),
+                                   Depth + 1);
     if (X == 0)
       EqCacheSCEV.unionSets(LHS, RHS);
     return X;
@@ -773,8 +769,8 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
       return (int)LNumOps - (int)RNumOps;
 
     for (unsigned i = 0; i != LNumOps; ++i) {
-      auto X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LOps[i],
-                                     ROps[i], DT, Depth + 1);
+      auto X = CompareSCEVComplexity(EqCacheSCEV, LI, LOps[i], ROps[i], DT,
+                                     Depth + 1);
       if (X != 0)
         return X;
     }
@@ -802,12 +798,10 @@ static void GroupByComplexity(SmallVectorImpl<const SCEV *> &Ops,
   if (Ops.size() < 2) return;  // Noop
 
   EquivalenceClasses<const SCEV *> EqCacheSCEV;
-  EquivalenceClasses<const Value *> EqCacheValue;
 
   // Whether LHS has provably less complexity than RHS.
   auto IsLessComplex = [&](const SCEV *LHS, const SCEV *RHS) {
-    auto Complexity =
-        CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LHS, RHS, DT);
+    auto Complexity = CompareSCEVComplexity(EqCacheSCEV, LI, LHS, RHS, DT);
     return Complexity && *Complexity < 0;
   };
   if (Ops.size() == 2) {
diff --git a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
index a6a5ffda3cb70..b923c920b4d75 100644
--- a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
+++ b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
@@ -1625,4 +1625,40 @@ TEST_F(ScalarEvolutionsTest, ForgetValueWithOverflowInst) {
   });
 }
 
+EST_F(ScalarEvolutionsTest, ComplexityComparatorIsStrictWeakOrdering) {
+  // Regression test for a case where caching of equivalent values caused the
+  // comparator to get inconsistent.
+  LLVMContext C;
+  SMDiagnostic Err;
+  std::unique_ptr<Module> M = parseAssemblyString(R"(
+    define i32 @foo(i32 %arg0) {
+      %1 = add i32 %arg0, 1
+      %2 = add i32 %arg0, 1
+      %3 = xor i32 %2, %1
+      %4 = add i32 %3, %2
+      %5 = add i32 %arg0, 1
+      %6 = xor i32 %5, %arg0
+      %7 = add i32 %arg0, %6
+      %8 = add i32 %5, %7
+      %9 = xor i32 %8, %7
+      %10 = add i32 %9, %8
+      %11 = xor i32 %10, %9
+      %12 = add i32 %11, %10
+      %13 = xor i32 %12, %11
+      %14 = add i32 %12, %13
+      %15 = add i32 %14, %4
+      ret i32 %15
+    })",
+                                                  Err, C);
+
+  ASSERT_TRUE(M && "Could not parse module?");
+  ASSERT_TRUE(!verifyModule(*M) && "Must have been well formed!");
+
+  runWithSE(*M, "foo", [](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
+    // When _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG, this will
+    // crash if the comparator has the specific caching bug.
+    SE.getSCEV(F.getEntryBlock().getTerminator()->getOperand(0));
+  });
+}
+
 }  // end namespace llvm

Copy link

github-actions bot commented Jul 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

The cache triggers almost never. When it does, it is likely to cause the
comparator to become inconsistent due to a bad interaction of the depth
limit and cache hits. This leads to crashes in debug builds.
nikic added a commit to nikic/llvm-project that referenced this pull request Jul 29, 2024
…ached

Similarly to CompareSCEVComplexity() itself, we should return
nullopt instead of 0 from CompareValueComplexity(), because we
do not know how the values related to each other. This means that
the values will not be added to the equality cache, and it can
not become inconsistent due to different results at different
depths.

The test case is adopted from
llvm#100721.
@nikic
Copy link
Contributor

nikic commented Jul 29, 2024

I've been looking into this, and think that this is not quite the right fix for the issue. I've put up an alternative at #101022. CompareSCEVComplexity was already returning nullopt when hitting the depth limit, but CompareValueComplexity wasn't. I think it we just drop the EqCache at the Value level, we might still run into a similar issue with the higher level cache at the SCEV level.

I do think that removing the cache is still a good idea independent of the strict weak ordering issue though. Based on my measurements, removing it improves compile-time. And given the very low depth limit involved here, I don't think we need to worry about regressing any degenerate cases either.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let's go ahead and merge this one first. The situation in CompareValueComplexity is pretty silly right now because MaxValueCompareDepth is much lower than MaxSCEVCompareDepth, but they reuse the same Depth parameter, which means that we're almost always going to hit the depth limit in this function. The implementation here warrants a larger change, so let's just drop the cache for now...

@nikic nikic merged commit 65697b1 into llvm:main Jul 30, 2024
7 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 30, 2024

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building llvm at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/2464

Here is the relevant piece of the build log for the reference:

Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyBreakpoints.py (616 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalBreak.py (617 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentNWatchNBreak.py (618 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalDelayBreak.py (619 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalDelayWatch.py (620 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalNWatchNBreak.py (621 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManySignals.py (622 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalWatch.py (623 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentManyWatchpoints.py (624 of 1995)
PASS: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentTwoBreakpointThreads.py (625 of 1995)
FAIL: lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalWatchBreak.py (626 of 1995)
******************** TEST 'lldb-api :: functionalities/thread/concurrent_events/TestConcurrentSignalWatchBreak.py' FAILED ********************
Script:
--
/usr/bin/python3.8 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env ARCHIVER=/usr/local/bin/llvm-ar --env OBJCOPY=/usr/bin/llvm-objcopy --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events -p TestConcurrentSignalWatchBreak.py
--
Exit Code: 1

Command Output (stdout):
--
lldb version 20.0.0git (https://github.com/llvm/llvm-project.git revision 65697b1c7cc9824d51f22f109c6a32428a7dd557)
  clang revision 65697b1c7cc9824d51f22f109c6a32428a7dd557
  llvm revision 65697b1c7cc9824d51f22f109c6a32428a7dd557
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

Watchpoint 1 hit:
old value: 0
new value: 1

--
Command Output (stderr):
--
FAIL: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test (TestConcurrentSignalWatchBreak.ConcurrentSignalWatchBreak)
======================================================================
FAIL: test (TestConcurrentSignalWatchBreak.ConcurrentSignalWatchBreak)
   Test a signal/watchpoint/breakpoint in multiple threads.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 148, in wrapper
    return func(*args, **kwargs)
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/functionalities/thread/concurrent_events/TestConcurrentSignalWatchBreak.py", line 15, in test
    self.do_thread_actions(
  File "/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/packages/Python/lldbsuite/test/concurrent_base.py", line 329, in do_thread_actions
    self.assertEqual(
AssertionError: 1 != 2 : Expected 1 stops due to signal delivery, but got 2
Config=aarch64-/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang
----------------------------------------------------------------------
Ran 1 test in 0.725s


aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Mar 27, 2025
This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent that CompareValueComplexity returned 0 due to hitting the max comparison depth.

This doesn't fully fix the issue, since we may still consider deeply
nested SCEVs that only differ in the innermost expression as equivalent
due to hitting depth limits, but this helps with part of the problem.

This also improves compile time in some compiles:
https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to llvm#100721.

Partially fixes llvm#130688 and a miscompile we're seeing.
aeubanks added a commit that referenced this pull request Apr 1, 2025
This was added in https://reviews.llvm.org/D26389 to help with extremely
deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent
that CompareValueComplexity returned 0 due to hitting the max comparison
depth.

This also improves compile time in some compiles:

https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to #100721.

Fixes #130688.
Ankur-0429 pushed a commit to Ankur-0429/llvm-project that referenced this pull request Apr 2, 2025
This was added in https://reviews.llvm.org/D26389 to help with extremely
deep SCEV expressions.

However, this is wrong since we may cache sub-SCEVs to be equivalent
that CompareValueComplexity returned 0 due to hitting the max comparison
depth.

This also improves compile time in some compiles:

https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u

Similar to llvm#100721.

Fixes llvm#130688.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants