-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[analyzer] Keep alive short-circuiting condition subexpressions in a conditional #100745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…conditional Fix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates `(x || x) ? n : m` to `m`. This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why. By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for `a ? b : c` it will consider `a`, `b`, and `c` alive when evaluating the ternary operator expression. To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome. `ExprEngine::visitGuardedExpr` chooses the first predecessor exploded node to carry the value of the conditional expression. It works well in the case of a simple condition, because when `a ? b : c` is evaluated, `a` is kept alive, so the two branches differ in the value of `a`. However, before this patch is applied, this strategy breaks for `(x || x) ? n : m`. `x` is not a direct child of the ternary expression. Due to short-circuiting, once `x` is assumed to be `false`, evaluation jumps directly to `n` and then to the result of the entire ternary expression. Given that the result of the entire condition `(x || x)` is not constructed, and `x` is not kept alive, the difference between the path coming through `n` and through `m` disappears. As a result, exploded nodes coming from the "true expression" and the "false expression" engender identical successors and merge the execution paths.
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-clang-analysis Author: Arseniy Zaostrovnykh (necto) ChangesFix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why. By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome. However, before this patch is applied, this strategy breaks for Full diff: https://github.com/llvm/llvm-project/pull/100745.diff 3 Files Affected:
diff --git a/clang/lib/Analysis/LiveVariables.cpp b/clang/lib/Analysis/LiveVariables.cpp
index 6d03dd05ca3d2..481932ee59c8e 100644
--- a/clang/lib/Analysis/LiveVariables.cpp
+++ b/clang/lib/Analysis/LiveVariables.cpp
@@ -214,6 +214,22 @@ static void AddLiveExpr(llvm::ImmutableSet<const Expr *> &Set,
Set = F.add(Set, LookThroughExpr(E));
}
+/// Add as a live expression all individual conditions in a logical expression.
+/// For example, for the expression:
+/// "(a < b) || (c && d && ((e || f) != (g && h)))"
+/// the following expressions will be added as live:
+/// "a < b", "c", "d", "((e || f) != (g && h))"
+static void AddAllConditionalTerms(llvm::ImmutableSet<const Expr *> &Set,
+ llvm::ImmutableSet<const Expr *>::Factory &F,
+ const Expr *Cond) {
+ AddLiveExpr(Set, F, Cond);
+ if (auto const *BO = dyn_cast<BinaryOperator>(Cond->IgnoreParens());
+ BO && BO->isLogicalOp()) {
+ AddAllConditionalTerms(Set, F, BO->getLHS());
+ AddAllConditionalTerms(Set, F, BO->getRHS());
+ }
+}
+
void TransferFunctions::Visit(Stmt *S) {
if (observer)
observer->observeStmt(S, currentBlock, val);
@@ -313,7 +329,27 @@ void TransferFunctions::Visit(Stmt *S) {
AddLiveExpr(val.liveExprs, LV.ESetFact, cast<ForStmt>(S)->getCond());
return;
}
-
+ case Stmt::ConditionalOperatorClass: {
+ // Keep not only direct children alive, but also all the short-circuited
+ // parts of the condition. Short-circuiting evaluation may cause the
+ // conditional operator evaluation to skip the evaluation of the entire
+ // condtion expression, so the value of the entire condition expression is
+ // never computed.
+ //
+ // This makes a difference when we compare exploded nodes coming from true
+ // and false expressions with no side effects: the only difference in the
+ // state is the value of (part of) the condition.
+ //
+ // BinaryConditionalOperatorClass ('x ?: y') is not affected because it
+ // explicitly calculates the value of the entire condition expression (to
+ // possibly use as a value for the "true expr") even if it is
+ // short-circuited.
+ auto const *CO = cast<ConditionalOperator>(S);
+ AddAllConditionalTerms(val.liveExprs, LV.ESetFact, CO->getCond());
+ AddLiveExpr(val.liveExprs, LV.ESetFact, CO->getTrueExpr());
+ AddLiveExpr(val.liveExprs, LV.ESetFact, CO->getFalseExpr());
+ return;
+ }
}
// HACK + FIXME: What is this? One could only guess that this is an attempt to
diff --git a/clang/test/Analysis/live-stmts.cpp b/clang/test/Analysis/live-stmts.cpp
index 16954f30129f7..c60f522588e39 100644
--- a/clang/test/Analysis/live-stmts.cpp
+++ b/clang/test/Analysis/live-stmts.cpp
@@ -193,3 +193,112 @@ void test_lambda_refcapture() {
// CHECK-NEXT: [ B2 (live expressions at block exit) ]
// CHECK-EMPTY:
// CHECK-EMPTY:
+
+int logicalOpInTernary(bool b) {
+ return (b || b) ? 0 : 1;
+}
+
+// [B6 (ENTRY)]
+// |
+// V
+// [B5 (b || ...)]
+// | \
+// | |
+// V V
+// [B4 (b||b)] ? [B2 (0)] : [B3 (1)]
+// \ /
+// ---|----
+// V
+// [B1] --> [B0 (EXIT)]
+// return
+
+// CHECK: [ B0 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B1 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B2 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B3 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B4 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B5 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B6 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
diff --git a/clang/test/Analysis/short-circuiting-eval.cpp b/clang/test/Analysis/short-circuiting-eval.cpp
new file mode 100644
index 0000000000000..d0f29a849ab1c
--- /dev/null
+++ b/clang/test/Analysis/short-circuiting-eval.cpp
@@ -0,0 +1,39 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core.DivideZero -verify %s
+
+int div0LogicalOpInTernary(bool b1) {
+ int y = (b1 || b1) ? 0 : 1;
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalAndArith(bool b1, int x) {
+ int y = (b1 || (x < 3)) ? 0 : 1;
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0NestedLogicalOp(bool b1) {
+ int y = (b1 && b1 || b1 && b1) ? 0 : 1;
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0TernaryInTernary(bool b) {
+ int y = ((b || b) ? false : true) ? 0 : 1;
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalOpParensInTernary(bool b1) {
+ int y = ((((b1)) || ((b1)))) ? 0 : 1;
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalOpInsideStExpr(bool b1) {
+ int y = ({1; (b1 || b1);}) ? 0 : 1;
+ // expected-warning@-1 {{expression result unused}}
+ return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0StExprInsideLogicalOp(bool b1) {
+ int y = (({1; b1;}) || ({1; b1;})) ? 0 : 1;
+ // expected-warning@-1 {{expression result unused}}
+ // expected-warning@-2 {{expression result unused}}
+ return 1 / y; // expected-warning {{Division by zero}}
+}
|
I already reviewed this downstream. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix, thanks for upstreaming this!
We'll merge this on Monday. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, thanks for fixing!
Fix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates
(x || x) ? n : m
tom
.This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why.
By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for
a ? b : c
it will considera
,b
, andc
alive when evaluating the ternary operator expression.To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome.
ExprEngine::visitGuardedExpr
chooses the first predecessor exploded node to carry the value of the conditional expression. It works well in the case of a simple condition, because whena ? b : c
is evaluated,a
is kept alive, so the two branches differ in the value ofa
.However, before this patch is applied, this strategy breaks for
(x || x) ? n : m
.x
is not a direct child of the ternary expression. Due to short-circuiting, oncex
is assumed to betrue
, evaluation jumps directly ton
and then to the result of the entire ternary expression. Given that the result of the entire condition(x || x)
is not constructed, andx
is not kept alive, the difference between the path coming throughn
and throughm
disappears. As a result, exploded nodes coming from the "true expression" and the "false expression" engender identical successors and merge the execution paths.