Skip to content

[clang][ASTImporter] Improve structural equivalence of overloadable operators. #72242

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 18, 2024

Conversation

balazske
Copy link
Collaborator

Operators that are overloadable may be parsed as CXXOperatorCallExpr or as UnaryOperator (or BinaryOperator). This depends on the context and can be different if a similar construct is imported into an existing AST. The two "forms" of the operator call AST nodes should be detected as equivalent to allow AST import of these cases.

This fix has probably other consequences because if a structure is imported that has CXXOperatorCallExpr into an AST with an existing similar structure that has UnaryOperator (or binary), the additional data in the CXXOperatorCallExpr node is lost at the import (because the existing node will be used). I am not sure if this can cause problems.

…perators.

Operators that are overloadable may be parsed as `CXXOperatorCallExpr`
or as `UnaryOperator` (or `BinaryOperator`). This depends on the context
and can be different if a similar construct is imported into an existing AST.
The two "forms" of the operator call AST nodes should be detected as
equivalent to allow AST import of these cases.

This fix has probably other consequences because if a structure is imported
that has `CXXOperatorCallExpr` into an AST with an existing similar structure
that has `UnaryOperator` (or binary), the additional data in the
`CXXOperatorCallExpr` node is lost at the import (because the existing node
will be used). I am not sure if this can cause problems.
@balazske balazske requested review from shafik and NagyDonat November 14, 2023 11:07
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Nov 14, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2023

@llvm/pr-subscribers-clang

Author: Balázs Kéri (balazske)

Changes

Operators that are overloadable may be parsed as CXXOperatorCallExpr or as UnaryOperator (or BinaryOperator). This depends on the context and can be different if a similar construct is imported into an existing AST. The two "forms" of the operator call AST nodes should be detected as equivalent to allow AST import of these cases.

This fix has probably other consequences because if a structure is imported that has CXXOperatorCallExpr into an AST with an existing similar structure that has UnaryOperator (or binary), the additional data in the CXXOperatorCallExpr node is lost at the import (because the existing node will be used). I am not sure if this can cause problems.


Full diff: https://github.com/llvm/llvm-project/pull/72242.diff

2 Files Affected:

  • (modified) clang/lib/AST/ASTStructuralEquivalence.cpp (+57)
  • (modified) clang/unittests/AST/StructuralEquivalenceTest.cpp (+170)
diff --git a/clang/lib/AST/ASTStructuralEquivalence.cpp b/clang/lib/AST/ASTStructuralEquivalence.cpp
index 6bb4bf14b873d7b..b937ff0579ca02d 100644
--- a/clang/lib/AST/ASTStructuralEquivalence.cpp
+++ b/clang/lib/AST/ASTStructuralEquivalence.cpp
@@ -98,6 +98,8 @@ static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
                                      QualType T1, QualType T2);
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
                                      Decl *D1, Decl *D2);
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+                                     const Stmt *S1, const Stmt *S2);
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
                                      const TemplateArgument &Arg1,
                                      const TemplateArgument &Arg2);
@@ -437,12 +439,67 @@ class StmtComparer {
 };
 } // namespace
 
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+                                     const UnaryOperator *E1,
+                                     const CXXOperatorCallExpr *E2) {
+  return UnaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+             E2->getOperator() &&
+         IsStructurallyEquivalent(Context, E1->getSubExpr(), E2->getArg(0));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+                                     const CXXOperatorCallExpr *E1,
+                                     const UnaryOperator *E2) {
+  return E1->getOperator() ==
+             UnaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+         IsStructurallyEquivalent(Context, E1->getArg(0), E2->getSubExpr());
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+                                     const BinaryOperator *E1,
+                                     const CXXOperatorCallExpr *E2) {
+  return BinaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+             E2->getOperator() &&
+         IsStructurallyEquivalent(Context, E1->getLHS(), E2->getArg(0)) &&
+         IsStructurallyEquivalent(Context, E1->getRHS(), E2->getArg(1));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+                                     const CXXOperatorCallExpr *E1,
+                                     const BinaryOperator *E2) {
+  return E1->getOperator() ==
+             BinaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+         IsStructurallyEquivalent(Context, E1->getArg(0), E2->getLHS()) &&
+         IsStructurallyEquivalent(Context, E1->getArg(1), E2->getRHS());
+}
+
 /// Determine structural equivalence of two statements.
 static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
                                      const Stmt *S1, const Stmt *S2) {
   if (!S1 || !S2)
     return S1 == S2;
 
+  // Check for statements with similar syntax but different AST.
+  // The unary and binary operators (like '+', '&') can be parsed as
+  // CXXOperatorCall too (and UnaryOperator or BinaryOperator).
+  // This depends on arguments with unresolved type and on the name lookup.
+  // The lookup results can be different in a "From" and "To" AST even if the
+  // compared structure is otherwise equivalent. For this reason we must treat
+  // these operators as equivalent even if they are represented by different AST
+  // nodes.
+  if (const auto *E2CXXOperatorCall = dyn_cast<CXXOperatorCallExpr>(S2)) {
+    if (const auto *E1Unary = dyn_cast<UnaryOperator>(S1))
+      return IsStructurallyEquivalent(Context, E1Unary, E2CXXOperatorCall);
+    if (const auto *E1Binary = dyn_cast<BinaryOperator>(S1))
+      return IsStructurallyEquivalent(Context, E1Binary, E2CXXOperatorCall);
+  }
+  if (const auto *E1CXXOperatorCall = dyn_cast<CXXOperatorCallExpr>(S1)) {
+    if (const auto *E2Unary = dyn_cast<UnaryOperator>(S2))
+      return IsStructurallyEquivalent(Context, E1CXXOperatorCall, E2Unary);
+    if (const auto *E2Binary = dyn_cast<BinaryOperator>(S2))
+      return IsStructurallyEquivalent(Context, E1CXXOperatorCall, E2Binary);
+  }
+
   // Compare the statements itself.
   StmtComparer Comparer(Context);
   if (!Comparer.IsEquivalent(S1, S2))
diff --git a/clang/unittests/AST/StructuralEquivalenceTest.cpp b/clang/unittests/AST/StructuralEquivalenceTest.cpp
index 44d950cfe758f14..539e3a4a2c01dca 100644
--- a/clang/unittests/AST/StructuralEquivalenceTest.cpp
+++ b/clang/unittests/AST/StructuralEquivalenceTest.cpp
@@ -2252,6 +2252,176 @@ TEST_F(StructuralEquivalenceStmtTest, UnaryOperatorDifferentOps) {
   EXPECT_FALSE(testStructuralMatch(t));
 }
 
+TEST_F(StructuralEquivalenceStmtTest,
+       CXXOperatorCallExprVsUnaryBinaryOperator) {
+  auto t = makeNamedDecls(
+      R"(
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(
+        A<T, x + y>,
+        A<T, x - y>,
+        A<T, -x>,
+        A<T, x * y>,
+        A<T, *x>,
+        A<T, x / y>,
+        A<T, x % y>,
+        A<T, x ^ y>,
+        A<T, x & y>,
+        A<T, &x>,
+        A<T, x | y>,
+        A<T, ~x>,
+        A<T, !x>,
+        A<T, x < y>,
+        A<T, (x > y)>,
+        A<T, x << y>,
+        A<T, (x >> y)>,
+        A<T, x == y>,
+        A<T, x != y>,
+        A<T, x <= y>,
+        A<T, x >= y>,
+        A<T, x <=> y>,
+        A<T, x && y>,
+        A<T, x || y>,
+        A<T, ++x>,
+        A<T, --x>,
+        A<T, (x , y)>,
+        A<T, x ->* y>,
+        A<T, x -> y>
+      );
+      )",
+      R"(
+      struct Bar {
+        Bar& operator=(Bar&);
+        Bar& operator->();
+      };
+
+      Bar& operator+(Bar&, Bar&);
+      Bar& operator+(Bar&);
+      Bar& operator-(Bar&, Bar&);
+      Bar& operator-(Bar&);
+      Bar& operator*(Bar&, Bar&);
+      Bar& operator*(Bar&);
+      Bar& operator/(Bar&, Bar&);
+      Bar& operator%(Bar&, Bar&);
+      Bar& operator^(Bar&, Bar&);
+      Bar& operator&(Bar&, Bar&);
+      Bar& operator&(Bar&);
+      Bar& operator|(Bar&, Bar&);
+      Bar& operator~(Bar&);
+      Bar& operator!(Bar&);
+      Bar& operator<(Bar&, Bar&);
+      Bar& operator>(Bar&, Bar&);
+      Bar& operator+=(Bar&, Bar&);
+      Bar& operator-=(Bar&, Bar&);
+      Bar& operator*=(Bar&, Bar&);
+      Bar& operator/=(Bar&, Bar&);
+      Bar& operator%=(Bar&, Bar&);
+      Bar& operator^=(Bar&, Bar&);
+      Bar& operator&=(Bar&, Bar&);
+      Bar& operator|=(Bar&, Bar&);
+      Bar& operator<<(Bar&, Bar&);
+      Bar& operator>>(Bar&, Bar&);
+      Bar& operator<<=(Bar&, Bar&);
+      Bar& operator>>=(Bar&, Bar&);
+      Bar& operator==(Bar&, Bar&);
+      Bar& operator!=(Bar&, Bar&);
+      Bar& operator<=(Bar&, Bar&);
+      Bar& operator>=(Bar&, Bar&);
+      Bar& operator<=>(Bar&, Bar&);
+      Bar& operator&&(Bar&, Bar&);
+      Bar& operator||(Bar&, Bar&);
+      Bar& operator++(Bar&);
+      Bar& operator--(Bar&);
+      Bar& operator,(Bar&, Bar&);
+      Bar& operator->*(Bar&, Bar&);
+
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(
+        A<T, x + y>,
+        A<T, x - y>,
+        A<T, -x>,
+        A<T, x * y>,
+        A<T, *x>,
+        A<T, x / y>,
+        A<T, x % y>,
+        A<T, x ^ y>,
+        A<T, x & y>,
+        A<T, &x>,
+        A<T, x | y>,
+        A<T, ~x>,
+        A<T, !x>,
+        A<T, x < y>,
+        A<T, (x > y)>,
+        A<T, x << y>,
+        A<T, (x >> y)>,
+        A<T, x == y>,
+        A<T, x != y>,
+        A<T, x <= y>,
+        A<T, x >= y>,
+        A<T, x <=> y>,
+        A<T, x && y>,
+        A<T, x || y>,
+        A<T, ++x>,
+        A<T, --x>,
+        A<T, (x , y)>,
+        A<T, x ->* y>,
+        A<T, x -> y>
+      );
+      )",
+      Lang_CXX20);
+  EXPECT_TRUE(testStructuralMatch(t));
+}
+
+TEST_F(StructuralEquivalenceStmtTest,
+       CXXOperatorCallExprVsUnaryBinaryOperatorNe) {
+  auto t = makeNamedDecls(
+      R"(
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(
+        A<T, x + y>
+      );
+      )",
+      R"(
+      struct Bar;
+
+      Bar& operator+(Bar&, Bar&);
+
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(
+        A<T, x - y>
+      );
+      )",
+      Lang_CXX11);
+  EXPECT_FALSE(testStructuralMatch(t));
+}
+
+TEST_F(StructuralEquivalenceStmtTest, NonTypeTemplateParm) {
+  auto t = makeNamedDecls(
+      R"(
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(A<T, x>);
+      )",
+      R"(
+      template <typename T, T x>
+      class A;
+      template <typename T, T x, T y>
+      void foo(A<T, y>);
+      )",
+      Lang_CXX11);
+  // FIXME: These should not match,
+  EXPECT_TRUE(testStructuralMatch(t));
+}
+
 TEST_F(StructuralEquivalenceStmtTest, UnresolvedLookupDifferentName) {
   auto t = makeStmts(
       R"(

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you write that unifying these different but equivalent expressions could've unintended consequences, perhaps you should check the effects of this change on CTU analysis of modern (a.k.a. crazy) C++ projects (or publish the results if you already did so).

Otherwise I can't find any problem in this code, but I'm not familiar enough with these intricate details of the standard, so this opinion is not too relevant.

R"(
struct Bar;

Bar& operator+(Bar&, Bar&);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the presence of this operator+ relevant if this TU doesn't use operator+ at all?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know exactly what was the purpose, but likely it should be operator - instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is now updated to operator- (the first code should contain an CXXOperatorCallExpr, the second a BinaryOperator in the template arg 2 of A).

// Check for statements with similar syntax but different AST.
// The unary and binary operators (like '+', '&') can be parsed as
// CXXOperatorCall too (and UnaryOperator or BinaryOperator).
// This depends on arguments with unresolved type and on the name lookup.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you perhaps elaborate this sentence a bit? Do I understand it correctly that e.g. in CXXOperatorCallExprVsUnaryBinaryOperator the presence of struct Bar in the second TU ensured that all the operators are parsed as CXXOperatorCalls, while in the first TU they are parsed as unary/binary operator expressions?

Copy link
Collaborator Author

@balazske balazske Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was coming from this post:
https://discourse.llvm.org/t/different-ast-translations-for-the-same-body-why/22741
The presence of an operator declaration is needed to change the parsed AST from BinaryOperator to CXXOverloadedOperatorCallExpr.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the link, it's really helpful. Perhaps it should be included in the source code comment as well, to help those who'll work on this code in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the updated comment is clear now.

@balazske
Copy link
Collaborator Author

I have updated the comment.
Probably I can test it with more projects after the other fixes with variable templates (#72841) are finished, then there may be less crashes.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarifications, it looks good to me now. (Except for the FIXME, but that's an independent issue and you're right that it should be handled in another commit.)

@balazske balazske merged commit 9ca1a08 into llvm:main Jan 18, 2024
@balazske balazske deleted the astimporter_cxxoperator_struct_eq branch January 18, 2024 08:20
ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
…perators. (llvm#72242)

Operators that are overloadable may be parsed as `CXXOperatorCallExpr`
or as `UnaryOperator` (or `BinaryOperator`). This depends on the context
and can be different if a similar construct is imported into an existing
AST. The two "forms" of the operator call AST nodes should be detected
as equivalent to allow AST import of these cases.

This fix has probably other consequences because if a structure is
imported that has `CXXOperatorCallExpr` into an AST with an existing
similar structure that has `UnaryOperator` (or binary), the additional
data in the `CXXOperatorCallExpr` node is lost at the import (because
the existing node will be used). I am not sure if this can cause
problems.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants