Skip to content

Commit e67e03a

Browse files
authored
[analyzer] EvalBinOpLL should return Unknown less often (#114222)
SValBuilder::getKnownValue, getMinValue, getMaxValue use SValBuilder::simplifySVal. simplifySVal does repeated simplification until a fixed-point is reached. A single step is done by SimpleSValBuilder::simplifySValOnce, using a Simplifier visitor. That will basically decompose SymSymExprs, and apply constant folding using the constraints we have in the State. Once it decomposes a SymSymExpr, it simplifies both sides and then uses the SValBuilder::evalBinOp to reconstruct the same - but now simpler - SymSymExpr, while applying some caching to remain performant. This decomposition, and then the subsequent re-composition poses new challenges to the SValBuilder::evalBinOp, which is built to handle expressions coming from real C/C++ code, thus applying some implicit assumptions. One previous assumption was that nobody would form an expression like "((int*)0) - q" (where q is an int pointer), because it doesn't really makes sense to write code like that. However, during simplification, we may end up with a call to evalBinOp similar to this. To me, simplifying a SymbolRef should never result in Unknown or Undef, unless it was Unknown or Undef initially or, during simplification we realized that it's a division by zero once we did the constant folding, etc. In the following case the simplified SVal should not become UnknownVal: ```c++ void top(char *p, char *q) { int diff = p - q; // diff: reg<p> - reg<q> if (!p) // p: NULL simplify(diff); // diff after simplification should be: 0(loc) - reg<q> } ``` Returning Unknown from the simplifySVal can weaken analysis precision in other places too, such as in SValBuilder::getKnownValue, getMinValue, or getMaxValue because we call simplifySVal before doing anything else. For nonloc::SymbolVals, this loss of precision is critical, because for those the SymbolRef carries an accurate type of the encoded computation, thus we should at least have a conservative upper or lower bound that we could return from getMinValue or getMaxValue - yet we would just return nullptr. ```c++ const llvm::APSInt *SimpleSValBuilder::getKnownValue(ProgramStateRef state, SVal V) { return getConstValue(state, simplifySVal(state, V)); } const llvm::APSInt *SimpleSValBuilder::getMinValue(ProgramStateRef state, SVal V) { V = simplifySVal(state, V); if (const llvm::APSInt *Res = getConcreteValue(V)) return Res; if (SymbolRef Sym = V.getAsSymbol()) return state->getConstraintManager().getSymMinVal(state, Sym); return nullptr; } ``` For now, I don't plan to make the simplification bullet-proof, I'm just explaining why I made this change and what you need to look out for in the future if you see a similar issue. CPP-5750
1 parent ccb7cc3 commit e67e03a

File tree

3 files changed

+108
-3
lines changed

3 files changed

+108
-3
lines changed

clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -860,11 +860,12 @@ SVal SimpleSValBuilder::evalBinOpLL(ProgramStateRef state,
860860
// If one of the operands is a symbol and the other is a constant,
861861
// build an expression for use by the constraint manager.
862862
if (SymbolRef rSym = rhs.getAsLocSymbol()) {
863-
// We can only build expressions with symbols on the left,
864-
// so we need a reversible operator.
865-
if (!BinaryOperator::isComparisonOp(op) || op == BO_Cmp)
863+
if (op == BO_Cmp)
866864
return UnknownVal();
867865

866+
if (!BinaryOperator::isComparisonOp(op))
867+
return makeNonLoc(L.getValue(), op, rSym, resultTy);
868+
868869
op = BinaryOperator::reverseComparisonOp(op);
869870
return makeNonLoc(rSym, op, L.getValue(), resultTy);
870871
}

clang/unittests/StaticAnalyzer/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -20,6 +20,7 @@ add_clang_unittest(StaticAnalysisTests
2020
RegisterCustomCheckersTest.cpp
2121
StoreTest.cpp
2222
SymbolReaperTest.cpp
23+
SValSimplifyerTest.cpp
2324
SValTest.cpp
2425
TestReturnValueUnderConstruction.cpp
2526
Z3CrosscheckOracleTest.cpp
Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
1+
//===- unittests/StaticAnalyzer/SValSimplifyerTest.cpp --------------------===//
2+
//
3+
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4+
// See https://llvm.org/LICENSE.txt for license information.
5+
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6+
//
7+
//===----------------------------------------------------------------------===//
8+
9+
#include "CheckerRegistration.h"
10+
#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
11+
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
12+
#include "clang/StaticAnalyzer/Core/Checker.h"
13+
#include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
14+
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
15+
#include "clang/StaticAnalyzer/Frontend/AnalysisConsumer.h"
16+
#include "clang/StaticAnalyzer/Frontend/CheckerRegistry.h"
17+
#include "llvm/ADT/Twine.h"
18+
#include "llvm/Support/raw_ostream.h"
19+
#include "gtest/gtest.h"
20+
21+
using namespace clang;
22+
using namespace ento;
23+
24+
static std::string toString(SVal V) {
25+
std::string Result;
26+
llvm::raw_string_ostream Stream(Result);
27+
V.dumpToStream(Stream);
28+
return Result;
29+
}
30+
31+
static void replace(std::string &Content, StringRef Substr,
32+
StringRef Replacement) {
33+
std::size_t Pos = 0;
34+
while ((Pos = Content.find(Substr, Pos)) != std::string::npos) {
35+
Content.replace(Pos, Substr.size(), Replacement);
36+
Pos += Replacement.size();
37+
}
38+
}
39+
40+
namespace {
41+
42+
class SimplifyChecker : public Checker<check::PreCall> {
43+
const BugType Bug{this, "SimplifyChecker"};
44+
const CallDescription SimplifyCall{CDM::SimpleFunc, {"simplify"}, 1};
45+
46+
void report(CheckerContext &C, const Expr *E, StringRef Description) const {
47+
PathDiagnosticLocation Loc(E->getExprLoc(), C.getSourceManager());
48+
auto Report = std::make_unique<BasicBugReport>(Bug, Description, Loc);
49+
C.emitReport(std::move(Report));
50+
}
51+
52+
public:
53+
void checkPreCall(const CallEvent &Call, CheckerContext &C) const {
54+
if (!SimplifyCall.matches(Call))
55+
return;
56+
const Expr *Arg = Call.getArgExpr(0);
57+
SVal Val = C.getSVal(Arg);
58+
SVal SimplifiedVal = C.getSValBuilder().simplifySVal(C.getState(), Val);
59+
std::string Subject = toString(Val);
60+
std::string Simplified = toString(SimplifiedVal);
61+
std::string Message = (llvm::Twine{Subject} + " -> " + Simplified).str();
62+
report(C, Arg, Message);
63+
}
64+
};
65+
} // namespace
66+
67+
static void addSimplifyChecker(AnalysisASTConsumer &AnalysisConsumer,
68+
AnalyzerOptions &AnOpts) {
69+
AnOpts.CheckersAndPackages = {{"SimplifyChecker", true}};
70+
AnalysisConsumer.AddCheckerRegistrationFn([](CheckerRegistry &Registry) {
71+
Registry.addChecker<SimplifyChecker>("SimplifyChecker", "EmptyDescription",
72+
"EmptyDocsUri");
73+
});
74+
}
75+
76+
static void runThisCheckerOnCode(const std::string &Code, std::string &Diags) {
77+
ASSERT_TRUE(runCheckerOnCode<addSimplifyChecker>(Code, Diags,
78+
/*OnlyEmitWarnings=*/true));
79+
ASSERT_FALSE(Diags.empty());
80+
ASSERT_EQ(Diags.back(), '\n');
81+
Diags.pop_back();
82+
}
83+
84+
namespace {
85+
86+
TEST(SValSimplifyerTest, LHSConstrainedNullPtrDiff) {
87+
constexpr auto Code = R"cpp(
88+
template <class T> void simplify(T);
89+
void LHSConstrainedNullPtrDiff(char *p, char *q) {
90+
int diff = p - q;
91+
if (!p)
92+
simplify(diff);
93+
})cpp";
94+
95+
std::string Diags;
96+
runThisCheckerOnCode(Code, Diags);
97+
replace(Diags, "(reg_$0<char * p>)", "reg_p");
98+
replace(Diags, "(reg_$1<char * q>)", "reg_q");
99+
// This should not be simplified to "Unknown".
100+
EXPECT_EQ(Diags, "SimplifyChecker: reg_p - reg_q -> 0U - reg_q");
101+
}
102+
103+
} // namespace

0 commit comments

Comments
 (0)