[analyzer] EvalBinOpLL should return Unknown less often #114222

steakhal · 2024-10-30T12:59:17Z

SValBuilder::getKnownValue, getMinValue, getMaxValue use SValBuilder::simplifySVal.

simplifySVal does repeated simplification until a fixed-point is reached. A single step is done by SimpleSValBuilder::simplifySValOnce, using a Simplifier visitor. That will basically decompose SymSymExprs, and apply constant folding using the constraints we have in the State. Once it decomposes a SymSymExpr, it simplifies both sides and then uses the SValBuilder::evalBinOp to reconstruct the same - but now simpler - SymSymExpr, while applying some caching to remain performant.
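
To make the "repeat a single step until nothing changes" idea concrete, here is a minimal, self-contained sketch. It is a toy model, not the analyzer's code: `Expr`, `Bindings`, `simplifyOnce`, and `simplify` are hypothetical names, the only operator is subtraction, and it assumes the bindings contain no cycles (otherwise the loop would not terminate).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy expression tree; none of these names exist in the analyzer.
struct Expr;
using ExprRef = std::shared_ptr<Expr>;

struct Expr {
  enum Kind { Const, Sym, Sub } kind;
  long value;       // used when kind == Const
  std::string name; // used when kind == Sym
  ExprRef lhs, rhs; // used when kind == Sub
};

ExprRef makeConst(long v) {
  return std::make_shared<Expr>(Expr{Expr::Const, v, "", nullptr, nullptr});
}
ExprRef makeSym(const std::string &n) {
  return std::make_shared<Expr>(Expr{Expr::Sym, 0, n, nullptr, nullptr});
}
ExprRef makeSub(ExprRef l, ExprRef r) {
  assert(l && r && "both operands must be set");
  return std::make_shared<Expr>(
      Expr{Expr::Sub, 0, "", std::move(l), std::move(r)});
}

std::string toString(const ExprRef &e) {
  switch (e->kind) {
  case Expr::Const: return std::to_string(e->value);
  case Expr::Sym:   return e->name;
  case Expr::Sub:   return "(" + toString(e->lhs) + " - " + toString(e->rhs) + ")";
  }
  return "";
}

// What we "know" about symbols: a symbol may be bound to another expression.
using Bindings = std::map<std::string, ExprRef>;

// One step: replace bound symbols, decompose a subtraction, simplify both
// sides, then rebuild it (folding when both sides became constants).
ExprRef simplifyOnce(const ExprRef &e, const Bindings &known) {
  switch (e->kind) {
  case Expr::Const:
    return e;
  case Expr::Sym: {
    auto it = known.find(e->name);
    return it == known.end() ? e : it->second;
  }
  case Expr::Sub: {
    ExprRef l = simplifyOnce(e->lhs, known);
    ExprRef r = simplifyOnce(e->rhs, known);
    if (l->kind == Expr::Const && r->kind == Expr::Const)
      return makeConst(l->value - r->value);
    return makeSub(l, r);
  }
  }
  return e;
}

// Repeat single steps until the printed form stops changing.
ExprRef simplify(ExprRef e, const Bindings &known) {
  for (;;) {
    ExprRef next = simplifyOnce(e, known);
    if (toString(next) == toString(e))
      return next;
    e = next;
  }
}
```

With `p` bound to the constant 0, `simplify` turns `p - q` into `(0 - q)`: a constant on the left and a symbol on the right, which is exactly the shape of operands that the re-composition step hands back to evalBinOp.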

This decomposition and the subsequent re-composition pose new challenges for SValBuilder::evalBinOp, which is built to handle expressions coming from real C/C++ code and therefore makes some implicit assumptions.

One such assumption was that nobody would form an expression like "((int*)0) - q" (where q is an int pointer), because it doesn't really make sense to write code like that.

However, during simplification, we may end up with a call to evalBinOp similar to this.

To me, simplifying a SymbolRef should never result in Unknown or Undef, unless it was Unknown or Undef initially, or during simplification we realized that it's a division by zero once we did the constant folding, etc.

In the following case the simplified SVal should not become UnknownVal:

void top(char *p, char *q) {
  int diff = p - q; // diff: reg<p> - reg<q>
  if (!p) // p: NULL
    simplify(diff); // diff after simplification should be: 0(loc) - reg<q>
}

Returning Unknown from simplifySVal can weaken analysis precision in other places too, such as in SValBuilder::getKnownValue, getMinValue, or getMaxValue, because they call simplifySVal before doing anything else.

For nonloc::SymbolVals, this loss of precision is critical: the SymbolRef carries the accurate type of the encoded computation, so we should at least have a conservative upper or lower bound to return from getMinValue or getMaxValue. Yet we would just return nullptr.

const llvm::APSInt *SimpleSValBuilder::getKnownValue(ProgramStateRef state,
                                                      SVal V) {
  return getConstValue(state, simplifySVal(state, V));
}

const llvm::APSInt *SimpleSValBuilder::getMinValue(ProgramStateRef state,
                                                    SVal V) {
  V = simplifySVal(state, V);

  if (const llvm::APSInt *Res = getConcreteValue(V))
    return Res;

  if (SymbolRef Sym = V.getAsSymbol())
    return state->getConstraintManager().getSymMinVal(state, Sym);

  return nullptr;
}
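
The effect on a caller can be sketched with a toy model. Everything below is hypothetical (`ToyVal` and `toyGetMinValue` are not analyzer APIs); it only mirrors the control flow of getMinValue above, assuming a symbolic value always has at least the lower bound implied by its type.

```cpp
#include <climits>
#include <string>

// Toy stand-in for an SVal after simplification; not an analyzer type.
struct ToyVal {
  enum Kind { Unknown, Concrete, Symbolic } kind;
  long concrete;    // used when kind == Concrete
  long typeMin;     // used when kind == Symbolic: bound implied by the type
  std::string dump; // printable form, for illustration only
};

// Mirrors the shape of getMinValue: concrete value first, then the bound
// of a symbol, and nullptr when neither is available.
const long *toyGetMinValue(const ToyVal &v) {
  if (v.kind == ToyVal::Concrete)
    return &v.concrete;
  if (v.kind == ToyVal::Symbolic)
    return &v.typeMin;
  return nullptr; // Unknown: the caller gets no bound at all
}
```

If simplification keeps `0 - reg_q` symbolic, the caller still gets a conservative bound such as `INT_MIN`; if it collapses to Unknown, the caller gets `nullptr` and has to handle that case.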

For now, I don't plan to make the simplification bullet-proof. I'm just explaining why I made this change and what to look out for in the future if you see a similar issue.

CPP-5750

steakhal · 2024-10-30T12:59:34Z

@necto

llvmbot · 2024-10-30T12:59:52Z

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balazs Benics (steakhal)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/114222.diff

3 Files Affected:

(modified) clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp (+4-3)
(modified) clang/unittests/StaticAnalyzer/CMakeLists.txt (+1)
(added) clang/unittests/StaticAnalyzer/SValSimplifyerTest.cpp (+103)

diff --git a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
index 45e48d435aca6a..229169f848e228 100644
--- a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp
@@ -860,11 +860,12 @@ SVal SimpleSValBuilder::evalBinOpLL(ProgramStateRef state,
     // If one of the operands is a symbol and the other is a constant,
     // build an expression for use by the constraint manager.
     if (SymbolRef rSym = rhs.getAsLocSymbol()) {
-      // We can only build expressions with symbols on the left,
-      // so we need a reversible operator.
-      if (!BinaryOperator::isComparisonOp(op) || op == BO_Cmp)
+      if (op == BO_Cmp)
         return UnknownVal();
 
+      if (!BinaryOperator::isComparisonOp(op))
+        return makeNonLoc(L.getValue(), op, rSym, resultTy);
+
       op = BinaryOperator::reverseComparisonOp(op);
       return makeNonLoc(rSym, op, L.getValue(), resultTy);
     }
diff --git a/clang/unittests/StaticAnalyzer/CMakeLists.txt b/clang/unittests/StaticAnalyzer/CMakeLists.txt
index 5ef72cfaea4011..f5da86e5456030 100644
--- a/clang/unittests/StaticAnalyzer/CMakeLists.txt
+++ b/clang/unittests/StaticAnalyzer/CMakeLists.txt
@@ -20,6 +20,7 @@ add_clang_unittest(StaticAnalysisTests
   RegisterCustomCheckersTest.cpp
   StoreTest.cpp
   SymbolReaperTest.cpp
+  SValSimplifyerTest.cpp
   SValTest.cpp
   TestReturnValueUnderConstruction.cpp
   Z3CrosscheckOracleTest.cpp
diff --git a/clang/unittests/StaticAnalyzer/SValSimplifyerTest.cpp b/clang/unittests/StaticAnalyzer/SValSimplifyerTest.cpp
new file mode 100644
index 00000000000000..b3feb0e4cce231
--- /dev/null
+++ b/clang/unittests/StaticAnalyzer/SValSimplifyerTest.cpp
@@ -0,0 +1,103 @@
+//===- unittests/StaticAnalyzer/SValSimplifyerTest.cpp --------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "CheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h"
+#include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h"
+#include "llvm/ADT/Twine.h"
+#include "llvm/Support/raw_ostream.h"
+#include "gtest/gtest.h"
+
+using namespace clang;
+using namespace ento;
+
+static std::string toString(SVal V) {
+  std::string Result;
+  llvm::raw_string_ostream Stream(Result);
+  V.dumpToStream(Stream);
+  return Result;
+}
+
+static void replace(std::string &Content, StringRef Substr,
+                    StringRef Replacement) {
+  std::size_t Pos = 0;
+  while ((Pos = Content.find(Substr, Pos)) != std::string::npos) {
+    Content.replace(Pos, Substr.size(), Replacement);
+    Pos += Replacement.size();
+  }
+}
+
+namespace {
+
+class SimplifyChecker : public Checker<check::PreCall> {
+  const BugType Bug{this, "SimplifyChecker"};
+  const CallDescription SimplifyCall{CDM::SimpleFunc, {"simplify"}, 1};
+
+  void report(CheckerContext &C, const Expr *E, StringRef Description) const {
+    PathDiagnosticLocation Loc(E->getExprLoc(), C.getSourceManager(), E);
+    auto Report = std::make_unique<BasicBugReport>(Bug, Description, Loc);
+    C.emitReport(std::move(Report));
+  }
+
+public:
+  void checkPreCall(const CallEvent &Call, CheckerContext &C) const {
+    if (!SimplifyCall.matches(Call))
+      return;
+    const Expr *Arg = Call.getArgExpr(0);
+    SVal Val = C.getSVal(Arg);
+    SVal SimplifiedVal = C.getSValBuilder().simplifySVal(C.getState(), Val);
+    std::string Subject = toString(Val);
+    std::string Simplified = toString(SimplifiedVal);
+    std::string Message = (llvm::Twine{Subject} + " -> " + Simplified).str();
+    report(C, Arg, Message);
+  }
+};
+} // namespace
+
+static void addSimplifyChecker(AnalysisASTConsumer &AnalysisConsumer,
+                               AnalyzerOptions &AnOpts) {
+  AnOpts.CheckersAndPackages = {{"SimplifyChecker", true}};
+  AnalysisConsumer.AddCheckerRegistrationFn([](CheckerRegistry &Registry) {
+    Registry.addChecker<SimplifyChecker>("SimplifyChecker", "EmptyDescription",
+                                         "EmptyDocsUri");
+  });
+}
+
+static void runThisCheckerOnCode(const std::string &Code, std::string &Diags) {
+  ASSERT_TRUE(runCheckerOnCode<addSimplifyChecker>(Code, Diags,
+                                                   /*OnlyEmitWarnings=*/true));
+  ASSERT_FALSE(Diags.empty());
+  ASSERT_EQ(Diags.back(), '\n');
+  Diags.pop_back();
+}
+
+namespace {
+
+TEST(SValSimplifyerTest, LHSConstrainedNullPtrDiff) {
+  constexpr auto Code = R"cpp(
+template <class T> void simplify(T);
+void LHSConstrainedNullPtrDiff(char *p, char *q) {
+  int diff = p - q;
+  if (!p)
+    simplify(diff);
+})cpp";
+
+  std::string Diags;
+  runThisCheckerOnCode(Code, Diags);
+  replace(Diags, "(reg_$0<char * p>)", "reg_p");
+  replace(Diags, "(reg_$1<char * q>)", "reg_q");
+  // This should not be simplified to "Unknown".
+  EXPECT_EQ(Diags, "SimplifyChecker: reg_p - reg_q -> 0U - reg_q");
+}
+
+} // namespace
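
The branch structure of the SimpleSValBuilder.cpp hunk above (concrete location on the left, symbol on the right) can be modeled roughly as follows. This is a sketch under assumptions: `Op`, `spelling`, and `evalConstVsSymbol` are hypothetical names, results are plain strings rather than SVals, and subtraction is the only non-comparison operator considered.

```cpp
#include <string>

// Toy operator set; Cmp stands for the three-way comparison (<=>).
enum class Op { Sub, LT, GT, LE, GE, EQ, NE, Cmp };

// In this toy model, everything except Sub is a comparison.
bool isComparisonOp(Op op) { return op != Op::Sub; }

// Swap the operand order of a comparison: C < sym  becomes  sym > C.
Op reverseComparisonOp(Op op) {
  switch (op) {
  case Op::LT: return Op::GT;
  case Op::GT: return Op::LT;
  case Op::LE: return Op::GE;
  case Op::GE: return Op::LE;
  default:     return op; // EQ and NE are symmetric
  }
}

const char *spelling(Op op) {
  switch (op) {
  case Op::Sub: return "-";
  case Op::LT:  return "<";
  case Op::GT:  return ">";
  case Op::LE:  return "<=";
  case Op::GE:  return ">=";
  case Op::EQ:  return "==";
  case Op::NE:  return "!=";
  case Op::Cmp: return "<=>";
  }
  return "?";
}

// Mirrors the order of checks after the patch.
std::string evalConstVsSymbol(long lhs, Op op, const std::string &rhsSym) {
  if (op == Op::Cmp)
    return "Unknown"; // still given up on
  if (!isComparisonOp(op)) // new: keep "C - sym" instead of Unknown
    return std::to_string(lhs) + " " + spelling(op) + " " + rhsSym;
  op = reverseComparisonOp(op); // comparisons: put the symbol on the left
  return rhsSym + " " + spelling(op) + " " + std::to_string(lhs);
}
```

Before the patch, the first check also covered every non-comparison operator, so the subtraction case fell into the "Unknown" branch.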

Xazax-hun

Makes sense to me. It would be nice if we had a minimal reproducer for a regression test instead of maintaining a large-ish test.

Xazax-hun · 2024-10-30T13:16:58Z

To me, simplifying a SymbolRef should never result in Unknown or Undef, unless it was Unknown or Undef initially, or during simplification we realized that it's a division by zero once we did the constant folding, etc.

I understand that we might not be ready for this, but it feels like at some point we should have this as a form of an assertion. I am not sure about the Undef part because there are many ways to get to undef during constant folding (signed overflows, division by zero, OOB access of an array, dereference of an invalid pointer). Even if some of those cannot happen during the current simplification, they might be modeled later. So, I think a simpler "the result can only be Unknown if the input was also Unknown" would make a lot of sense to me.

steakhal · 2024-10-30T13:31:17Z

Makes sense to me. It would be nice if we had a minimal reproducer for a regression test instead of maintaining a large-ish test.

I wish I could add a LIT test instead.
Of the getMinValue/getMaxValue and getKnownValue APIs, only the latter is used slightly more broadly, at call sites that can't assume it returns a valid result. Still, these APIs are barely used.
In my case I have additional context in my downstream checker, where I can make this assumption, and I was surprised to see crashes. So I had no option but to craft a unit test for this.

NagyDonat · 2024-10-31T10:06:15Z

Offtopic @Xazax-hun

there are many ways to get to undef during constant folding (signed overflows ...

I was surprised to read this because as far as I know (knew) the analyzer models signed operations as if overflow was completely natural for them. (See e.g. the method assumeInBound which IIRC relies on this to convert a "within this interval" check to a ">= some threshold" check. I also vaguely recall that there are many FPs where the analyzer assumes wraparound of signed values.)

I would be very grateful if somebody pointed out some logic in the analyzer source which ensures that a signed overflow produces an UndefinedVal (or the overflow is disallowed by some other mechanism). If I was right and we don't have such code yet, then it might be good to add some...

steakhal · 2024-10-31T10:28:01Z

Offtopic @Xazax-hun

there are many ways to get to undef during constant folding (signed overflows ...

I was surprised to read this because as far as I know (knew) the analyzer models signed operations as if overflow was completely natural for them. (See e.g. the method assumeInBound which IIRC relies on this to convert a "within this interval" check to a ">= some threshold" check. I also vaguely recall that there are many FPs where the analyzer assumes wraparound of signed values.)

I would be very grateful if somebody pointed out some logic in the analyzer source which ensures that a signed overflow produces an UndefinedVal (or the overflow is disallowed by some other mechanism). If I was right and we don't have such code yet, then it might be good to add some...

I think we don't have any mechanisms like these. This was just a theoretical comment if I understood it right.

Xazax-hun · 2024-10-31T11:19:53Z

I think we don't have any mechanisms like these. This was just a theoretical comment if I understood it right.

Actually, both. Yes, we don't have many of these in the engine as of today, but we do model some overflows in some checkers, like the bitwise shifts checker. It would be nice if we could write that logic once, and reuse it in constant folding as well, but this is really theoretical at this point.

llvm-ci · 2024-10-31T12:52:28Z

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-bootstrap-asan running on sanitizer-buildbot1 while building clang at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/52/builds/3353

Here is the relevant piece of the build log for the reference

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86709 of 86710 tests, 88 workers --
Testing:  0.. 10
FAIL: Clang :: Interpreter/inline-virtual.cpp (12586 of 86709)
******************** TEST 'Clang :: Interpreter/inline-virtual.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 6: cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation      | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
RUN: at line 8: cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation      -Xcc -O2 | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation -Xcc -O2
JIT session error: In graph incr_module_25-jitted-objectbuffer, section .text.startup: relocation target "__dso_handle" at address 0x76991e50e000 is out of range of Delta32 fixup at 0x7a991ee2f039 (<anonymous block> @ 0x7a991ee2f010 + 0x29)
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_25, a2, $.incr_module_25.__inits.0 }) }
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_25 }) }
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp:26:11: error: CHECK: expected string not found in input
// CHECK: ~A(2)
          ^
<stdin>:1:262: note: scanning from here
clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl... clang-repl> clang-repl... clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> ~A(1)
                                                                                                                                                                                                                                                                     ^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl... clang-repl> clang-repl... clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> ~A(1) 
check:26                                                                                                                                                                                                                                                                          X error: no match found
          2: clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl>  
check:26     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

Step 10 (stage2/asan check) failure: stage2/asan check (failure)
...
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using lld-link: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/lld-link
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using ld64.lld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/ld64.lld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/llvm/config.py:506: note: using wasm-ld: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/wasm-ld
llvm-lit: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 86709 of 86710 tests, 88 workers --
Testing:  0.. 10
FAIL: Clang :: Interpreter/inline-virtual.cpp (12586 of 86709)
******************** TEST 'Clang :: Interpreter/inline-virtual.cpp' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 6: cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation      | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
RUN: at line 8: cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation      -Xcc -O2 | /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ cat /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/FileCheck /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp
+ /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm_build_asan/bin/clang-repl -Xcc -fno-rtti -Xcc -fno-sized-deallocation -Xcc -O2
JIT session error: In graph incr_module_25-jitted-objectbuffer, section .text.startup: relocation target "__dso_handle" at address 0x76991e50e000 is out of range of Delta32 fixup at 0x7a991ee2f039 (<anonymous block> @ 0x7a991ee2f010 + 0x29)
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_25, a2, $.incr_module_25.__inits.0 }) }
error: Failed to materialize symbols: { (main, { __orc_init_func.incr_module_25 }) }
/home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp:26:11: error: CHECK: expected string not found in input
// CHECK: ~A(2)
          ^
<stdin>:1:262: note: scanning from here
clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl... clang-repl> clang-repl... clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> ~A(1)
                                                                                                                                                                                                                                                                     ^

Input file: <stdin>
Check file: /home/b/sanitizer-x86_64-linux-bootstrap-asan/build/llvm-project/clang/test/Interpreter/inline-virtual.cpp

-dump-input=help explains the following input dump.

Input was:
<<<<<<
          1: clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl... clang-repl> clang-repl... clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> ~A(1) 
check:26                                                                                                                                                                                                                                                                          X error: no match found
          2: clang-repl> clang-repl> clang-repl> clang-repl> clang-repl> clang-repl>  
check:26     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>>

--

steakhal added the clang:static analyzer label Oct 30, 2024

steakhal requested review from Xazax-hun, haoNoQ, Szelethus and NagyDonat October 30, 2024 12:59

llvmbot added the clang Clang issues not falling into any other category label Oct 30, 2024

Xazax-hun reviewed Oct 30, 2024

Xazax-hun approved these changes Oct 30, 2024

steakhal added 2 commits October 30, 2024 16:55

Merge remote-tracking branch 'llvm' into bb/fix-simplify-to-unknown

6569496

Fix build

a908396

steakhal merged commit e67e03a into llvm:main Oct 31, 2024
8 checks passed

steakhal deleted the bb/fix-simplify-to-unknown branch October 31, 2024 10:01
