-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][ASTImporter] Improve structural equivalence of overloadable operators. #72242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][ASTImporter] Improve structural equivalence of overloadable operators. #72242
Conversation
…perators. Operators that are overloadable may be parsed as `CXXOperatorCallExpr` or as `UnaryOperator` (or `BinaryOperator`). This depends on the context and can be different if a similar construct is imported into an existing AST. The two "forms" of the operator call AST nodes should be detected as equivalent to allow AST import of these cases. This fix has probably other consequences because if a structure is imported that has `CXXOperatorCallExpr` into an AST with an existing similar structure that has `UnaryOperator` (or binary), the additional data in the `CXXOperatorCallExpr` node is lost at the import (because the existing node will be used). I am not sure if this can cause problems.
@llvm/pr-subscribers-clang Author: Balázs Kéri (balazske) ChangesOperators that are overloadable may be parsed as This fix has probably other consequences because if a structure is imported that has Full diff: https://github.com/llvm/llvm-project/pull/72242.diff 2 Files Affected:
diff --git a/clang/lib/AST/ASTStructuralEquivalence.cpp b/clang/lib/AST/ASTStructuralEquivalence.cpp
index 6bb4bf14b873d7b..b937ff0579ca02d 100644
--- a/clang/lib/AST/ASTStructuralEquivalence.cpp
+++ b/clang/lib/AST/ASTStructuralEquivalence.cpp
@@ -98,6 +98,8 @@ static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
QualType T1, QualType T2);
static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
Decl *D1, Decl *D2);
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const Stmt *S1, const Stmt *S2);
static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
const TemplateArgument &Arg1,
const TemplateArgument &Arg2);
@@ -437,12 +439,67 @@ class StmtComparer {
};
} // namespace
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const UnaryOperator *E1,
+ const CXXOperatorCallExpr *E2) {
+ return UnaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+ E2->getOperator() &&
+ IsStructurallyEquivalent(Context, E1->getSubExpr(), E2->getArg(0));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const CXXOperatorCallExpr *E1,
+ const UnaryOperator *E2) {
+ return E1->getOperator() ==
+ UnaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(0), E2->getSubExpr());
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const BinaryOperator *E1,
+ const CXXOperatorCallExpr *E2) {
+ return BinaryOperator::getOverloadedOperator(E1->getOpcode()) ==
+ E2->getOperator() &&
+ IsStructurallyEquivalent(Context, E1->getLHS(), E2->getArg(0)) &&
+ IsStructurallyEquivalent(Context, E1->getRHS(), E2->getArg(1));
+}
+
+static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
+ const CXXOperatorCallExpr *E1,
+ const BinaryOperator *E2) {
+ return E1->getOperator() ==
+ BinaryOperator::getOverloadedOperator(E2->getOpcode()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(0), E2->getLHS()) &&
+ IsStructurallyEquivalent(Context, E1->getArg(1), E2->getRHS());
+}
+
/// Determine structural equivalence of two statements.
static bool IsStructurallyEquivalent(StructuralEquivalenceContext &Context,
const Stmt *S1, const Stmt *S2) {
if (!S1 || !S2)
return S1 == S2;
+ // Check for statements with similar syntax but different AST.
+ // The unary and binary operators (like '+', '&') can be parsed as
+ // CXXOperatorCall too (and UnaryOperator or BinaryOperator).
+ // This depends on arguments with unresolved type and on the name lookup.
+ // The lookup results can be different in a "From" and "To" AST even if the
+ // compared structure is otherwise equivalent. For this reason we must treat
+ // these operators as equivalent even if they are represented by different AST
+ // nodes.
+ if (const auto *E2CXXOperatorCall = dyn_cast<CXXOperatorCallExpr>(S2)) {
+ if (const auto *E1Unary = dyn_cast<UnaryOperator>(S1))
+ return IsStructurallyEquivalent(Context, E1Unary, E2CXXOperatorCall);
+ if (const auto *E1Binary = dyn_cast<BinaryOperator>(S1))
+ return IsStructurallyEquivalent(Context, E1Binary, E2CXXOperatorCall);
+ }
+ if (const auto *E1CXXOperatorCall = dyn_cast<CXXOperatorCallExpr>(S1)) {
+ if (const auto *E2Unary = dyn_cast<UnaryOperator>(S2))
+ return IsStructurallyEquivalent(Context, E1CXXOperatorCall, E2Unary);
+ if (const auto *E2Binary = dyn_cast<BinaryOperator>(S2))
+ return IsStructurallyEquivalent(Context, E1CXXOperatorCall, E2Binary);
+ }
+
// Compare the statements itself.
StmtComparer Comparer(Context);
if (!Comparer.IsEquivalent(S1, S2))
diff --git a/clang/unittests/AST/StructuralEquivalenceTest.cpp b/clang/unittests/AST/StructuralEquivalenceTest.cpp
index 44d950cfe758f14..539e3a4a2c01dca 100644
--- a/clang/unittests/AST/StructuralEquivalenceTest.cpp
+++ b/clang/unittests/AST/StructuralEquivalenceTest.cpp
@@ -2252,6 +2252,176 @@ TEST_F(StructuralEquivalenceStmtTest, UnaryOperatorDifferentOps) {
EXPECT_FALSE(testStructuralMatch(t));
}
+TEST_F(StructuralEquivalenceStmtTest,
+ CXXOperatorCallExprVsUnaryBinaryOperator) {
+ auto t = makeNamedDecls(
+ R"(
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(
+ A<T, x + y>,
+ A<T, x - y>,
+ A<T, -x>,
+ A<T, x * y>,
+ A<T, *x>,
+ A<T, x / y>,
+ A<T, x % y>,
+ A<T, x ^ y>,
+ A<T, x & y>,
+ A<T, &x>,
+ A<T, x | y>,
+ A<T, ~x>,
+ A<T, !x>,
+ A<T, x < y>,
+ A<T, (x > y)>,
+ A<T, x << y>,
+ A<T, (x >> y)>,
+ A<T, x == y>,
+ A<T, x != y>,
+ A<T, x <= y>,
+ A<T, x >= y>,
+ A<T, x <=> y>,
+ A<T, x && y>,
+ A<T, x || y>,
+ A<T, ++x>,
+ A<T, --x>,
+ A<T, (x , y)>,
+ A<T, x ->* y>,
+ A<T, x -> y>
+ );
+ )",
+ R"(
+ struct Bar {
+ Bar& operator=(Bar&);
+ Bar& operator->();
+ };
+
+ Bar& operator+(Bar&, Bar&);
+ Bar& operator+(Bar&);
+ Bar& operator-(Bar&, Bar&);
+ Bar& operator-(Bar&);
+ Bar& operator*(Bar&, Bar&);
+ Bar& operator*(Bar&);
+ Bar& operator/(Bar&, Bar&);
+ Bar& operator%(Bar&, Bar&);
+ Bar& operator^(Bar&, Bar&);
+ Bar& operator&(Bar&, Bar&);
+ Bar& operator&(Bar&);
+ Bar& operator|(Bar&, Bar&);
+ Bar& operator~(Bar&);
+ Bar& operator!(Bar&);
+ Bar& operator<(Bar&, Bar&);
+ Bar& operator>(Bar&, Bar&);
+ Bar& operator+=(Bar&, Bar&);
+ Bar& operator-=(Bar&, Bar&);
+ Bar& operator*=(Bar&, Bar&);
+ Bar& operator/=(Bar&, Bar&);
+ Bar& operator%=(Bar&, Bar&);
+ Bar& operator^=(Bar&, Bar&);
+ Bar& operator&=(Bar&, Bar&);
+ Bar& operator|=(Bar&, Bar&);
+ Bar& operator<<(Bar&, Bar&);
+ Bar& operator>>(Bar&, Bar&);
+ Bar& operator<<=(Bar&, Bar&);
+ Bar& operator>>=(Bar&, Bar&);
+ Bar& operator==(Bar&, Bar&);
+ Bar& operator!=(Bar&, Bar&);
+ Bar& operator<=(Bar&, Bar&);
+ Bar& operator>=(Bar&, Bar&);
+ Bar& operator<=>(Bar&, Bar&);
+ Bar& operator&&(Bar&, Bar&);
+ Bar& operator||(Bar&, Bar&);
+ Bar& operator++(Bar&);
+ Bar& operator--(Bar&);
+ Bar& operator,(Bar&, Bar&);
+ Bar& operator->*(Bar&, Bar&);
+
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(
+ A<T, x + y>,
+ A<T, x - y>,
+ A<T, -x>,
+ A<T, x * y>,
+ A<T, *x>,
+ A<T, x / y>,
+ A<T, x % y>,
+ A<T, x ^ y>,
+ A<T, x & y>,
+ A<T, &x>,
+ A<T, x | y>,
+ A<T, ~x>,
+ A<T, !x>,
+ A<T, x < y>,
+ A<T, (x > y)>,
+ A<T, x << y>,
+ A<T, (x >> y)>,
+ A<T, x == y>,
+ A<T, x != y>,
+ A<T, x <= y>,
+ A<T, x >= y>,
+ A<T, x <=> y>,
+ A<T, x && y>,
+ A<T, x || y>,
+ A<T, ++x>,
+ A<T, --x>,
+ A<T, (x , y)>,
+ A<T, x ->* y>,
+ A<T, x -> y>
+ );
+ )",
+ Lang_CXX20);
+ EXPECT_TRUE(testStructuralMatch(t));
+}
+
+TEST_F(StructuralEquivalenceStmtTest,
+ CXXOperatorCallExprVsUnaryBinaryOperatorNe) {
+ auto t = makeNamedDecls(
+ R"(
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(
+ A<T, x + y>
+ );
+ )",
+ R"(
+ struct Bar;
+
+ Bar& operator+(Bar&, Bar&);
+
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(
+ A<T, x - y>
+ );
+ )",
+ Lang_CXX11);
+ EXPECT_FALSE(testStructuralMatch(t));
+}
+
+TEST_F(StructuralEquivalenceStmtTest, NonTypeTemplateParm) {
+ auto t = makeNamedDecls(
+ R"(
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(A<T, x>);
+ )",
+ R"(
+ template <typename T, T x>
+ class A;
+ template <typename T, T x, T y>
+ void foo(A<T, y>);
+ )",
+ Lang_CXX11);
+ // FIXME: These should not match,
+ EXPECT_TRUE(testStructuralMatch(t));
+}
+
TEST_F(StructuralEquivalenceStmtTest, UnresolvedLookupDifferentName) {
auto t = makeStmts(
R"(
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you write that unifying these different but equivalent expressions could've unintended consequences, perhaps you should check the effects of this change on CTU analysis of modern (a.k.a. crazy) C++ projects (or publish the results if you already did so).
Otherwise I can't find any problem in this code, but I'm not familiar enough with these intricate details of the standard, so this opinion is not too relevant.
R"( | ||
struct Bar; | ||
|
||
Bar& operator+(Bar&, Bar&); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the presence of this operator+
relevant if this TU doesn't use operator+
at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know exactly what was the purpose, but likely it should be operator -
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code is now updated to operator-
(the first code should contain an CXXOperatorCallExpr
, the second a BinaryOperator
in the template arg 2 of A
).
// Check for statements with similar syntax but different AST. | ||
// The unary and binary operators (like '+', '&') can be parsed as | ||
// CXXOperatorCall too (and UnaryOperator or BinaryOperator). | ||
// This depends on arguments with unresolved type and on the name lookup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you perhaps elaborate this sentence a bit? Do I understand it correctly that e.g. in CXXOperatorCallExprVsUnaryBinaryOperator
the presence of struct Bar
in the second TU ensured that all the operators are parsed as CXXOperatorCall
s, while in the first TU they are parsed as unary/binary operator expressions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was coming from this post:
https://discourse.llvm.org/t/different-ast-translations-for-the-same-body-why/22741
The presence of an operator declaration is needed to change the parsed AST from BinaryOperator
to CXXOverloadedOperatorCallExpr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the link, it's really helpful. Perhaps it should be included in the source code comment as well, to help those who'll work on this code in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the updated comment is clear now.
I have updated the comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarifications, it looks good to me now. (Except for the FIXME
, but that's an independent issue and you're right that it should be handled in another commit.)
…perators. (llvm#72242) Operators that are overloadable may be parsed as `CXXOperatorCallExpr` or as `UnaryOperator` (or `BinaryOperator`). This depends on the context and can be different if a similar construct is imported into an existing AST. The two "forms" of the operator call AST nodes should be detected as equivalent to allow AST import of these cases. This fix has probably other consequences because if a structure is imported that has `CXXOperatorCallExpr` into an AST with an existing similar structure that has `UnaryOperator` (or binary), the additional data in the `CXXOperatorCallExpr` node is lost at the import (because the existing node will be used). I am not sure if this can cause problems.
Operators that are overloadable may be parsed as
CXXOperatorCallExpr
or asUnaryOperator
(orBinaryOperator
). This depends on the context and can be different if a similar construct is imported into an existing AST. The two "forms" of the operator call AST nodes should be detected as equivalent to allow AST import of these cases.This fix has probably other consequences because if a structure is imported that has
CXXOperatorCallExpr
into an AST with an existing similar structure that hasUnaryOperator
(or binary), the additional data in theCXXOperatorCallExpr
node is lost at the import (because the existing node will be used). I am not sure if this can cause problems.