Skip to content

[analyzer] Keep alive short-circuiting condition subexpressions in a conditional #100745

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2024

Conversation

necto
Copy link
Contributor

@necto necto commented Jul 26, 2024

Fix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates (x || x) ? n : m to m.

This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why.

By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for a ? b : c it will consider a, b, and c alive when evaluating the ternary operator expression.

To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome. ExprEngine::visitGuardedExpr chooses the first predecessor exploded node to carry the value of the conditional expression. It works well in the case of a simple condition, because when a ? b : c is evaluated, a is kept alive, so the two branches differ in the value of a.

However, before this patch is applied, this strategy breaks for (x || x) ? n : m. x is not a direct child of the ternary expression. Due to short-circuiting, once x is assumed to be true, evaluation jumps directly to n and then to the result of the entire ternary expression. Given that the result of the entire condition (x || x) is not constructed, and x is not kept alive, the difference between the path coming through n and through m disappears. As a result, exploded nodes coming from the "true expression" and the "false expression" engender identical successors and merge the execution paths.

…conditional

Fix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates `(x || x) ? n : m` to `m`.

This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why.

By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for `a ? b : c` it will consider `a`, `b`, and `c` alive when evaluating the ternary operator expression.

To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome. `ExprEngine::visitGuardedExpr` chooses the first predecessor exploded node to carry the value of the conditional expression. It works well in the case of a simple condition, because when `a ? b : c` is evaluated, `a` is kept alive, so the two branches differ in the value of `a`.

However, before this patch is applied, this strategy breaks for `(x || x) ? n : m`. `x` is not a direct child of the ternary expression. Due to short-circuiting, once `x` is assumed to be `false`, evaluation jumps directly to `n` and then to the result of the entire ternary expression. Given that the result of the entire condition `(x || x)` is not constructed, and `x` is not kept alive, the difference between the path coming through `n` and through `m` disappears. As a result, exploded nodes coming from the "true expression" and the "false expression" engender identical successors and merge the execution paths.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:analysis labels Jul 26, 2024
@llvmbot
Copy link
Member

llvmbot commented Jul 26, 2024

@llvm/pr-subscribers-clang-static-analyzer-1
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-analysis

Author: Arseniy Zaostrovnykh (necto)

Changes

Fix the false negative caused by state merging in the evaluation of a short-circuiting expression inside the condition of a ternary operator. The fixed symptom is that CSA always evaluates (x || x) ? n : m to m.

This change forces the analyzer to consider all logical expressions prone to short-circuiting alive until the entire conditional expression is evaluated. Here is why.

By default, LiveVariables analysis marks only direct subexpressions as live relative to any expression. So for a ? b : c it will consider a, b, and c alive when evaluating the ternary operator expression.

To explore both possibilities opened by a ternary operator, it is important to keep something different about the exploded nodes created after the evaluation of its branches. These two nodes come to the same location, so they must have different states. Otherwise, they will be considered identical and can engender only one outcome. ExprEngine::visitGuardedExpr chooses the first predecessor exploded node to carry the value of the conditional expression. It works well in the case of a simple condition, because when a ? b : c is evaluated, a is kept alive, so the two branches differ in the value of a.

However, before this patch is applied, this strategy breaks for (x || x) ? n : m. x is not a direct child of the ternary expression. Due to short-circuiting, once x is assumed to be false, evaluation jumps directly to n and then to the result of the entire ternary expression. Given that the result of the entire condition (x || x) is not constructed, and x is not kept alive, the difference between the path coming through n and through m disappears. As a result, exploded nodes coming from the "true expression" and the "false expression" engender identical successors and merge the execution paths.


Full diff: https://github.com/llvm/llvm-project/pull/100745.diff

3 Files Affected:

  • (modified) clang/lib/Analysis/LiveVariables.cpp (+37-1)
  • (modified) clang/test/Analysis/live-stmts.cpp (+109)
  • (added) clang/test/Analysis/short-circuiting-eval.cpp (+39)
diff --git a/clang/lib/Analysis/LiveVariables.cpp b/clang/lib/Analysis/LiveVariables.cpp
index 6d03dd05ca3d2..481932ee59c8e 100644
--- a/clang/lib/Analysis/LiveVariables.cpp
+++ b/clang/lib/Analysis/LiveVariables.cpp
@@ -214,6 +214,22 @@ static void AddLiveExpr(llvm::ImmutableSet<const Expr *> &Set,
   Set = F.add(Set, LookThroughExpr(E));
 }
 
+/// Add as a live expression all individual conditions in a logical expression.
+/// For example, for the expression:
+/// "(a < b) || (c && d && ((e || f) != (g && h)))"
+/// the following expressions will be added as live:
+/// "a < b", "c", "d", "((e || f) != (g && h))"
+static void AddAllConditionalTerms(llvm::ImmutableSet<const Expr *> &Set,
+                                   llvm::ImmutableSet<const Expr *>::Factory &F,
+                                   const Expr *Cond) {
+  AddLiveExpr(Set, F, Cond);
+  if (auto const *BO = dyn_cast<BinaryOperator>(Cond->IgnoreParens());
+      BO && BO->isLogicalOp()) {
+    AddAllConditionalTerms(Set, F, BO->getLHS());
+    AddAllConditionalTerms(Set, F, BO->getRHS());
+  }
+}
+
 void TransferFunctions::Visit(Stmt *S) {
   if (observer)
     observer->observeStmt(S, currentBlock, val);
@@ -313,7 +329,27 @@ void TransferFunctions::Visit(Stmt *S) {
       AddLiveExpr(val.liveExprs, LV.ESetFact, cast<ForStmt>(S)->getCond());
       return;
     }
-
+    case Stmt::ConditionalOperatorClass: {
+      // Keep not only direct children alive, but also all the short-circuited
+      // parts of the condition. Short-circuiting evaluation may cause the
+      // conditional operator evaluation to skip the evaluation of the entire
+      // condtion expression, so the value of the entire condition expression is
+      // never computed.
+      //
+      // This makes a difference when we compare exploded nodes coming from true
+      // and false expressions with no side effects: the only difference in the
+      // state is the value of (part of) the condition.
+      //
+      // BinaryConditionalOperatorClass ('x ?: y') is not affected because it
+      // explicitly calculates the value of the entire condition expression (to
+      // possibly use as a value for the "true expr") even if it is
+      // short-circuited.
+      auto const *CO = cast<ConditionalOperator>(S);
+      AddAllConditionalTerms(val.liveExprs, LV.ESetFact, CO->getCond());
+      AddLiveExpr(val.liveExprs, LV.ESetFact, CO->getTrueExpr());
+      AddLiveExpr(val.liveExprs, LV.ESetFact, CO->getFalseExpr());
+      return;
+    }
   }
 
   // HACK + FIXME: What is this? One could only guess that this is an attempt to
diff --git a/clang/test/Analysis/live-stmts.cpp b/clang/test/Analysis/live-stmts.cpp
index 16954f30129f7..c60f522588e39 100644
--- a/clang/test/Analysis/live-stmts.cpp
+++ b/clang/test/Analysis/live-stmts.cpp
@@ -193,3 +193,112 @@ void test_lambda_refcapture() {
 // CHECK-NEXT: [ B2 (live expressions at block exit) ]
 // CHECK-EMPTY:
 // CHECK-EMPTY:
+
+int logicalOpInTernary(bool b) {
+  return (b || b) ? 0 : 1;
+}
+
+// [B6 (ENTRY)]
+//    |
+//    V
+// [B5 (b || ...)]
+//   |            \
+//   |             |
+//   V             V
+// [B4 (b||b)] ? [B2 (0)] : [B3 (1)]
+//                \        /
+//                 ---|----
+//                    V
+//                   [B1] --> [B0 (EXIT)]
+//                  return
+
+// CHECK: [ B0 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B1 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B2 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK:   `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B3 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK:   `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B4 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK:   `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B5 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: BinaryOperator {{.*}} '_Bool' '||'
+// CHECK: |-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: | `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK: `-ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK:   `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
+// CHECK-EMPTY:
+// CHECK-EMPTY:
+// CHECK: [ B6 (live expressions at block exit) ]
+// CHECK-EMPTY:
+// CHECK: ImplicitCastExpr {{.*}} '_Bool' <LValueToRValue>
+// CHECK: `-DeclRefExpr {{.*}} '_Bool' lvalue ParmVar {{.*}} 'b' '_Bool'
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 0
+// CHECK-EMPTY:
+// CHECK: IntegerLiteral {{.*}} 'int' 1
diff --git a/clang/test/Analysis/short-circuiting-eval.cpp b/clang/test/Analysis/short-circuiting-eval.cpp
new file mode 100644
index 0000000000000..d0f29a849ab1c
--- /dev/null
+++ b/clang/test/Analysis/short-circuiting-eval.cpp
@@ -0,0 +1,39 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core.DivideZero -verify %s
+
+int div0LogicalOpInTernary(bool b1) {
+  int y = (b1 || b1) ? 0 : 1;
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalAndArith(bool b1, int x) {
+  int y = (b1 || (x < 3)) ? 0 : 1;
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0NestedLogicalOp(bool b1) {
+  int y = (b1 && b1 || b1 && b1) ? 0 : 1;
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0TernaryInTernary(bool b) {
+  int y = ((b || b) ? false : true) ? 0 : 1;
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalOpParensInTernary(bool b1) {
+  int y = ((((b1)) || ((b1)))) ? 0 : 1;
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0LogicalOpInsideStExpr(bool b1) {
+  int y = ({1; (b1 || b1);}) ? 0 : 1;
+  // expected-warning@-1 {{expression result unused}}
+  return 1 / y; // expected-warning {{Division by zero}}
+}
+
+int div0StExprInsideLogicalOp(bool b1) {
+  int y = (({1; b1;}) || ({1; b1;})) ? 0 : 1;
+  // expected-warning@-1 {{expression result unused}}
+  // expected-warning@-2 {{expression result unused}}
+  return 1 / y; // expected-warning {{Division by zero}}
+}

@steakhal
Copy link
Contributor

I already reviewed this downstream.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix, thanks for upstreaming this!

@steakhal
Copy link
Contributor

We'll merge this on Monday.

@steakhal steakhal merged commit 73c72f2 into llvm:main Jul 29, 2024
11 checks passed
@necto necto deleted the az/fix-short-circuit-fn branch July 29, 2024 06:44
Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks for fixing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants