Skip to content

[clang][dataflow] Refactor PropagateResultObject() with a switch statement. #88865

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2024

Conversation

martinboehme
Copy link
Contributor

See also discussion in #88726.

Copy link

github-actions bot commented Apr 16, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@martinboehme
Copy link
Contributor Author

martinboehme commented Apr 16, 2024

Here's a draft that shows what PropagateResultObject() looks like when refactored using a switch statement.

I'm not sure if this is an improvement or not. I do see how this makes the case distinction clearer -- OTOH, there's duplication between the switch cases and the casts, for example:

      case Stmt::BinaryOperatorClass: {
        auto *Op = cast<BinaryOperator>(E);

Would appreciate opinions.

@martinboehme martinboehme force-pushed the propagate_result_object_switch branch from 3da6980 to 440ace1 Compare April 16, 2024 11:02
@ymand
Copy link
Collaborator

ymand commented Apr 16, 2024

Clearly, this is a matter of taste, so I would defer to your opinion, since you are the primary maintainer of this code. But, personally, I prefer this style since it makes clear that the body of the function is a single case analysis, which is not obvious from the series of if statements. It's unfortunate that the enum syntax is so bulky (the need for Stmt:: and the Class suffix). The casts don't bother me (well, any more than C++'s general lack of built-in datatype and pattern matching support bother me :)).

@martinboehme martinboehme marked this pull request as ready for review April 16, 2024 12:40
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Apr 16, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 16, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-analysis

Author: None (martinboehme)

Changes

See also discussion in #88726.


Full diff: https://github.com/llvm/llvm-project/pull/88865.diff

1 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+41-27)
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index ee2581143e1141..7ef6b9d1df5182 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -500,21 +500,43 @@ class ResultObjectVisitor : public RecursiveASTVisitor<ResultObjectVisitor> {
 
     ResultObjectMap[E] = Loc;
 
+    switch (E->getStmtClass()) {
     // The following AST node kinds are "original initializers": They are the
     // lowest-level AST node that initializes a given object, and nothing
     // below them can initialize the same object (or part of it).
-    if (isa<CXXConstructExpr>(E) || isa<CallExpr>(E) || isa<LambdaExpr>(E) ||
-        isa<CXXDefaultArgExpr>(E) || isa<CXXDefaultInitExpr>(E) ||
-        isa<CXXStdInitializerListExpr>(E)) {
+    case Stmt::CXXConstructExprClass:
+    case Stmt::CallExprClass:
+    case Stmt::LambdaExprClass:
+    case Stmt::CXXDefaultArgExprClass:
+    case Stmt::CXXDefaultInitExprClass:
+    case Stmt::CXXStdInitializerListExprClass:
+      return;
+    case Stmt::BinaryOperatorClass: {
+      auto *Op = cast<BinaryOperator>(E);
+      if (Op->getOpcode() == BO_Cmp) {
+        // Builtin `<=>` returns a `std::strong_ordering` object. We
+        // consider this to be an "original" initializer too (see above).
+        return;
+      }
+      if (Op->isCommaOp()) {
+        PropagateResultObject(Op->getRHS(), Loc);
+        return;
+      }
+      // We don't expect any other binary operators to produce a record
+      // prvalue, so if we get here, we've hit some case we don't know
+      // about.
+      assert(false);
       return;
     }
-    if (auto *Op = dyn_cast<BinaryOperator>(E);
-        Op && Op->getOpcode() == BO_Cmp) {
-      // Builtin `<=>` returns a `std::strong_ordering` object.
+    case Stmt::BinaryConditionalOperatorClass:
+    case Stmt::ConditionalOperatorClass: {
+      auto *Cond = cast<AbstractConditionalOperator>(E);
+      PropagateResultObject(Cond->getTrueExpr(), Loc);
+      PropagateResultObject(Cond->getFalseExpr(), Loc);
       return;
     }
-
-    if (auto *InitList = dyn_cast<InitListExpr>(E)) {
+    case Stmt::InitListExprClass: {
+      auto *InitList = cast<InitListExpr>(E);
       if (!InitList->isSemanticForm())
         return;
       if (InitList->isTransparent()) {
@@ -544,28 +566,20 @@ class ResultObjectVisitor : public RecursiveASTVisitor<ResultObjectVisitor> {
       }
       return;
     }
-
-    if (auto *Op = dyn_cast<BinaryOperator>(E); Op && Op->isCommaOp()) {
-      PropagateResultObject(Op->getRHS(), Loc);
+    default: {
+      // All other expression nodes that propagate a record prvalue should
+      // have exactly one child.
+      SmallVector<Stmt *, 1> Children(E->child_begin(), E->child_end());
+      LLVM_DEBUG({
+        if (Children.size() != 1)
+          E->dump();
+      });
+      assert(Children.size() == 1);
+      for (Stmt *S : Children)
+        PropagateResultObject(cast<Expr>(S), Loc);
       return;
     }
-
-    if (auto *Cond = dyn_cast<AbstractConditionalOperator>(E)) {
-      PropagateResultObject(Cond->getTrueExpr(), Loc);
-      PropagateResultObject(Cond->getFalseExpr(), Loc);
-      return;
     }
-
-    // All other expression nodes that propagate a record prvalue should have
-    // exactly one child.
-    SmallVector<Stmt *, 1> Children(E->child_begin(), E->child_end());
-    LLVM_DEBUG({
-      if (Children.size() != 1)
-        E->dump();
-    });
-    assert(Children.size() == 1);
-    for (Stmt *S : Children)
-      PropagateResultObject(cast<Expr>(S), Loc);
   }
 
 private:

@martinboehme
Copy link
Contributor Author

Clearly, this is a matter of taste, so I would defer to your opinion, since you are the primary maintainer of this code. But, personally, I prefer this style since it makes clear that the body of the function is a single case analysis, which is not obvious from the series of if statements. It's unfortunate that the enum syntax is so bulky (the need for Stmt:: and the Class suffix). The casts don't bother me (well, any more than C++'s general lack of built-in datatype and pattern matching support bother me :)).

I think maybe I'm coming round to your point of view (or, at least, I no longer clearly believe that the existing version is better).

I'm making this a non-draft PR and will add @Xazax-hun for an additional opinion. I have some other PRs in flight though that also touch PropagateResultObject(), and my plan is to land those first. In the meantime, I think the direction of this PR is clear though, so I think it makes sense to review.

@Xazax-hun
Copy link
Collaborator

It's unfortunate that the enum syntax is so bulky (the need for Stmt:: and the Class suffix).

Once we can use C++20, it could get a bit better thanks to https://en.cppreference.com/w/cpp/language/enum#Using-enum-declaration.

I'm making this a non-draft PR and will add @Xazax-hun for an additional opinion.

I also don't have a strong opinion; I am fine with both styles. That being said, if C++ gets proper pattern matching in the future, the switch might be easier to refactor.

@martinboehme
Copy link
Contributor Author

It's unfortunate that the enum syntax is so bulky (the need for Stmt:: and the Class suffix).

Once we can use C++20, it could get a bit better thanks to https://en.cppreference.com/w/cpp/language/enum#Using-enum-declaration.

Good point, thanks.

I'm making this a non-draft PR and will add @Xazax-hun for an additional opinion.

I also don't have a strong opinion; I am fine with both styles. That being said, if C++ gets proper pattern matching in the future, the switch might be easier to refactor.

Thanks for your input!

So I think given that you and I are fine with both options and @ymand has a preference for the switch statement, I'll go with the switch statement.

In the meantime, I've landed two other PRs that touch PropagateResultObject() (#88872 and #88875). I'll integrate those changes in this PR and will ping the review thread once that has happened.

@martinboehme martinboehme force-pushed the propagate_result_object_switch branch from 440ace1 to 69f4445 Compare April 17, 2024 08:04
@martinboehme martinboehme requested review from ymand and Xazax-hun April 17, 2024 08:04
@martinboehme
Copy link
Contributor Author

Ready for review.

Copy link
Collaborator

@ymand ymand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@martinboehme martinboehme merged commit da579ad into llvm:main Apr 18, 2024
@martinboehme
Copy link
Contributor Author

Didn't notice that there were failing tests in CI. Reverting.

martinboehme added a commit that referenced this pull request Apr 18, 2024
…witch statement." (#89176)

Reverts #88865

There were failing tests in the CI that I didn't notice. Sorry.
@martinboehme
Copy link
Contributor Author

First of all, a followup: I should of course have noticed the failling CI tests, but a contributing factor was that I was locally running tests in Release mode, i.e. with assert() compiled out.

I've now looked at why tests fail, and it is because when using E->getStmtClass(), we need to enumerate all possible sub-classes that we're interested in, whereas when using isa / dyn_cast, it's enough to switch on the base class.

In practical terms, this means that to get complete coverage for what used to be covered by isa<CXXConstructExpr>(E), we need to check not only for CXXConstructExprClass but also CXXTemporaryObjectExprClass. To get coverage for what used to be covered by isa<CallExpr>(E), we need to check not only for CallExprClass but also for CXXMemberCallExprClass and CXXOperatorCallExprClass.

In my mind, this shifts the tradeoff: Having to enumerate all possible sub-classes adds verbosity and carries the risk of forgetting a subclass, particularly if subclasses are added in the future.

I would therefore propose to stay with the existing implementation as I'm no longer convinced that the switch statement is the better option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants