-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SCEV] Return nullopt from CompareValueComplexity() if depth limit reached #101022
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ached Similarly to CompareSCEVComplexity() itself, we should return nullopt instead of 0 from CompareValueComplexity(), because we do not know how the values related to each other. This means that the values will not be added to the equality cache, and it can not become inconsistent due to different results at different depths. The test case is adopted from llvm#100721.
@llvm/pr-subscribers-llvm-analysis Author: Nikita Popov (nikic) ChangesSimilarly to CompareSCEVComplexity() itself, we should return nullopt instead of 0 from CompareValueComplexity(), because we do not know how the values relate to each other. This means that the values will not be added to the equality cache, and it can not become inconsistent due to different results at different depths. The test case is adopted from #100721. This also has a pretty large positive compile-time impact in some cases (see http://llvm-compile-time-tracker.com/compare.php?from=73c72f2c6505d5bc8b47bb0420f6cba5b24270fe&to=02cb332824f53cf650558de2f88b5ba81aa799d8&stat=instructions:u). This is because we early exit complexity comparison and skip the EquivalenceClasses cache. The LTO improvement on lencod is legitimate -- that benchmark currently spends 10% of total time in SCEV complexity grouping. (And the thing all these SCEVs are computed for is quite silly ... I'll address that separately.) Full diff: https://github.com/llvm/llvm-project/pull/101022.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 51cffac808768..2fdfe243e2dc5 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -597,11 +597,14 @@ void SCEVUnknown::allUsesReplacedWith(Value *New) {
///
/// Since we do not continue running this routine on expression trees once we
/// have seen unequal values, there is no need to track them in the cache.
-static int
+static std::optional<int>
CompareValueComplexity(EquivalenceClasses<const Value *> &EqCacheValue,
const LoopInfo *const LI, Value *LV, Value *RV,
unsigned Depth) {
- if (Depth > MaxValueCompareDepth || EqCacheValue.isEquivalent(LV, RV))
+ if (Depth > MaxValueCompareDepth)
+ return std::nullopt;
+
+ if (EqCacheValue.isEquivalent(LV, RV))
return 0;
// Order pointer values after integer values. This helps SCEVExpander form
@@ -660,7 +663,7 @@ CompareValueComplexity(EquivalenceClasses<const Value *> &EqCacheValue,
return (int)LNumOps - (int)RNumOps;
for (unsigned Idx : seq(LNumOps)) {
- int Result =
+ std::optional<int> Result =
CompareValueComplexity(EqCacheValue, LI, LInst->getOperand(Idx),
RInst->getOperand(Idx), Depth + 1);
if (Result != 0)
@@ -705,8 +708,8 @@ CompareSCEVComplexity(EquivalenceClasses<const SCEV *> &EqCacheSCEV,
const SCEVUnknown *LU = cast<SCEVUnknown>(LHS);
const SCEVUnknown *RU = cast<SCEVUnknown>(RHS);
- int X = CompareValueComplexity(EqCacheValue, LI, LU->getValue(),
- RU->getValue(), Depth + 1);
+ std::optional<int> X = CompareValueComplexity(
+ EqCacheValue, LI, LU->getValue(), RU->getValue(), Depth + 1);
if (X == 0)
EqCacheSCEV.unionSets(LHS, RHS);
return X;
diff --git a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
index a6a5ffda3cb70..76e6095636305 100644
--- a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
+++ b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
@@ -1625,4 +1625,40 @@ TEST_F(ScalarEvolutionsTest, ForgetValueWithOverflowInst) {
});
}
+TEST_F(ScalarEvolutionsTest, ComplexityComparatorIsStrictWeakOrdering) {
+ // Regression test for a case where caching of equivalent values caused the
+ // comparator to get inconsistent.
+ LLVMContext C;
+ SMDiagnostic Err;
+ std::unique_ptr<Module> M = parseAssemblyString(R"(
+ define i32 @foo(i32 %arg0) {
+ %1 = add i32 %arg0, 1
+ %2 = add i32 %arg0, 1
+ %3 = xor i32 %2, %1
+ %4 = add i32 %3, %2
+ %5 = add i32 %arg0, 1
+ %6 = xor i32 %5, %arg0
+ %7 = add i32 %arg0, %6
+ %8 = add i32 %5, %7
+ %9 = xor i32 %8, %7
+ %10 = add i32 %9, %8
+ %11 = xor i32 %10, %9
+ %12 = add i32 %11, %10
+ %13 = xor i32 %12, %11
+ %14 = add i32 %12, %13
+ %15 = add i32 %14, %4
+ ret i32 %15
+ })",
+ Err, C);
+
+ ASSERT_TRUE(M && "Could not parse module?");
+ ASSERT_TRUE(!verifyModule(*M) && "Must have been well formed!");
+
+ runWithSE(*M, "foo", [](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
+ // When _LIBCPP_HARDENING_MODE == _LIBCPP_HARDENING_MODE_DEBUG, this will
+ // crash if the comparator has the specific caching bug.
+ SE.getSCEV(F.getEntryBlock().getTerminator()->getOperand(0));
+ });
+}
+
} // end namespace llvm
|
if (Depth > MaxValueCompareDepth) | ||
return std::nullopt; | ||
|
||
if (EqCacheValue.isEquivalent(LV, RV)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fixes the bug with inconsistent results, but did you check if the cache actually does anything useful in the average case? In the (admittedly limited) examples I looked at, it was almost pure overhead (i.e., almost no hits, but lots of writes and checks).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that we should also drop the cache, see #100721 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, missed that. SGTM.
Similarly to CompareSCEVComplexity() itself, we should return nullopt instead of 0 from CompareValueComplexity(), because we do not know how the values relate to each other. This means that the values will not be added to the equality cache, and it can not become inconsistent due to different results at different depths.
The test case is adopted from #100721.
This also has a pretty large positive compile-time impact in some cases (see http://llvm-compile-time-tracker.com/compare.php?from=73c72f2c6505d5bc8b47bb0420f6cba5b24270fe&to=02cb332824f53cf650558de2f88b5ba81aa799d8&stat=instructions:u). This is because we early exit complexity comparison and skip the EquivalenceClasses cache. The LTO improvement on lencod is legitimate -- that benchmark currently spends 10% of total time in SCEV complexity grouping. (And the thing all these SCEVs are computed for is quite silly ... I'll address that separately.)