Skip to content

[Clang][Sema] Fix malformed AST for anonymous class access in template. #90842

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 14, 2024

Conversation

martinboehme
Copy link
Contributor

@martinboehme martinboehme commented May 2, 2024

Observed erroneous behavior

Prior to this change, a MemberExpr that accesses an anonymous class might have a prvalue as its base (even though C++ mandates that the base of a MemberExpr must be a glvalue), if the code containing the MemberExpr was in a template.

Here's an example on godbolt (that is essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

struct S {
  struct {
    int i;
  };
};

It then accesses the member i using the expression S().i.

When we do this in a non-template function, we get the following AST:

`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing

As expected, the AST contains a MaterializeTemporarExpr to materialize the prvalue S() before accessing its members.

When we perform this access in a function template (that doesn't actually even use its template parameter), the AST for the template itself looks the same as above. However, the AST for an instantiation of the template looks different:

`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing

Note how the inner MemberExpr (the one accessing the anonymous struct) acts on a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen, probably because CodeGen is set up to deal with MemberExprs on rvalues in C. However, it does cause issues in the dataflow framework, which only supports C++ today and expects the base of a MemberExpr to be a glvalue.

Beyond the issues with the dataflow framework, I think this issue should be fixed because it goes contrary to what the C++ standard mandates, and the AST produced for the non-template case indicates that we want to follow the C++ rules here.

Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, TreeTransform strips any MaterializeTemporaryExprs from the AST.

It is therefore up to TreeTransform::RebuildMemberExpr() to recreate a MaterializeTemporaryExpr if needed. In the general case, it does this: It calls Sema::BuildMemberReferenceExpr(), which ensures that the base is a glvalue by materializing a temporary if needed. However, when TreeTransform::RebuildMemberExpr() encounters an anonymous class, it calls Sema::BuildFieldReferenceExpr(), which, unlike Sema::BuildMemberReferenceExpr(), does not make sure that the base is a glvalue.

Proposed fix

I considered several possible ways to fix this issue:

  • Add logic to Sema::BuildFieldReferenceExpr() that materializes a temporary if needed. This appears to work, but it feels like the fix is in the wrong place:

    • AFAIU, other callers of Sema::BuildFieldReferenceExpr() don't need this logic.
    • The issue is caused by TreeTransform removing the MaterializeTemporaryExpr, so it seems the fix should also be in TreeTransform
  • Materialize the temporary directly in TreeTransform::RebuildMemberExpr() if needed (within the case that deals with anonymous classes).

    This would work, too, but it would duplicate logic that already exists in Sema::BuildMemberReferenceExpr() (which we leverage for the general case).

  • Use Sema::BuildMemberReferenceExpr() instead of Sema::BuildFieldReferenceExpr() for the anonymous class case, so that it also uses the existing logic for materializing the temporary.

    This is the option I've decided to go with here. There's a slight wrinkle in that we create a LookupResult that claims we looked up the unnamed field for the anonymous class -- even though we would obviously never be able to look up an unnamed field. I think this is defensible and still better than the other alternatives, but I would welcome feedback on this from others who know the code better.

# Observed erroneous behavior

Prior to this change, a `MemberExpr` that accesses an anonymous class might have
a prvalue as its base (even though C++ mandates that the base of a `MemberExpr`
must be a glvalue), if the code containing the `MemberExpr` was in a template.

Here's an example on [godbolt](https://godbolt.org/z/Gz1Mer9oz) (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

```cxx
struct S {
  struct {
    int i;
  };
};
```

It then accesses the member `i` using the expression `S().i`.

When we do this in a non-template function, we get the following AST:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

As expected, the AST contains a `MaterializeTemporarExpr` to materialize the
prvalue `S()` before accessing its members.

When we perform this access in a function template (that doesn't actually even
use its template parameter), the AST for the template itself looks the same as
above. However, the AST for an instantiation of the template looks different:

```
`-ExprWithCleanups <col:10, col:14> 'int'
  `-ImplicitCastExpr <col:10, col:14> 'int' <LValueToRValue>
    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing
```

Note how the inner `MemberExpr` (the one accessing the anonymous struct) acts on
a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen, probably
because CodeGen is set up to deal with `MemberExpr`s on rvalues in C. However,
it does cause issues in the dataflow framework, which only supports C++ today
and expects the base of a `MemberExpr` to be a glvalue.

Beyond the issues with the dataflow framework, I think this issue should be
fixed because it goes contrary to what the C++ standard mandates, and the AST
produced for the non-template case indicates that we want to follow the C++
rules here.

# Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, `TreeTransform` [strips any
`MaterializeTemporaryExpr`s](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L14853)
from the AST.

It is therefore up to
[`TreeTransform::RebuildMemberExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2853)
to recreate a `MaterializeTemporaryExpr` if needed. In the [general
case](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2915),
it does this: It calls `Sema::BuildMemberReferenceExpr()`, which ensures that
the base is a glvalue by
[materializing a
temporary](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/SemaExprMember.cpp#L1016)
if needed. However, when `TreeTransform::RebuildMemberExpr()` encounters an
anonymous class, it [calls
`Sema::BuildFieldReferenceExpr()`](https://github.com/llvm/llvm-project/blob/cd132dcbeb0fc79fd657bd5e0a8e9244c3fb5da6/clang/lib/Sema/TreeTransform.h#L2880),
which, unlike `Sema::BuildMemberReferenceExpr()`, does not make sure that the
base is a glvalue.

# Proposed fix

I considered several possible ways to fix this issue:

-  Add logic to `Sema::BuildFieldReferenceExpr()` that materializes a temporary
   if needed. This appears to work, but it feels like the fix is in the wrong
   place:
   - AFAIU, other callers of `Sema::BuildFieldReferenceExpr()` don't need this
     logic.
   - The issue is caused by `TreeTransform` removing the
     `MaterializeTemporaryExpr`, so it seems the fix should also be in
     `TreeTransform`

-  Materialize the temporary directly in `TreeTransform::RebuildMemberExpr()` if
   needed (within the case that deals with anonymous classes).

   This would work, too, but it would duplicate logic that already exists in
   `Sema::BuildMemberReferenceExpr()` (which we leverage for the general case).

-  Use `Sema::BuildMemberReferenceExpr()` instead of
   `Sema::BuildFieldReferenceExpr()` for the anonymous class case, so that it
   also uses the existing logic for materializing the temporary.

   This is the option I've decided to go with here. There's a slight wrinkle in
   that we create a `LookupResult` that claims we looked up the unnamed field
   for the anonymous class -- even though we would obviously never be able to
   look up an unnamed field. I think this is defensible and still better than
   the other alternatives, but I would welcome feedback on this from others who
   know the code better.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html labels May 2, 2024
@llvmbot
Copy link
Member

llvmbot commented May 2, 2024

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

Observed erroneous behavior

Prior to this change, a MemberExpr that accesses an anonymous class might have
a prvalue as its base (even though C++ mandates that the base of a MemberExpr
must be a glvalue), if the code containing the MemberExpr was in a template.

Here's an example on godbolt (that is
essentially identical to the new test this patch adds).

This example sets up a struct containing an anonymous struct:

struct S {
  struct {
    int i;
  };
};

It then accesses the member i using the expression S().i.

When we do this in a non-template function, we get the following AST:

`-ExprWithCleanups &lt;col:10, col:14&gt; 'int'
  `-ImplicitCastExpr &lt;col:10, col:14&gt; 'int' &lt;LValueToRValue&gt;
    `-MemberExpr &lt;col:10, col:14&gt; 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr &lt;col:10, col:14&gt; 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr &lt;col:10, col:12&gt; 'S' xvalue
          `-CXXTemporaryObjectExpr &lt;col:10, col:12&gt; 'S' 'void () noexcept' zeroing

As expected, the AST contains a MaterializeTemporarExpr to materialize the
prvalue S() before accessing its members.

When we perform this access in a function template (that doesn't actually even
use its template parameter), the AST for the template itself looks the same as
above. However, the AST for an instantiation of the template looks different:

`-ExprWithCleanups &lt;col:10, col:14&gt; 'int'
  `-ImplicitCastExpr &lt;col:10, col:14&gt; 'int' &lt;LValueToRValue&gt;
    `-MemberExpr &lt;col:10, col:14&gt; 'int' xvalue .i 0xbdcb3c0
      `-MaterializeTemporaryExpr &lt;col:10, col:14&gt; 'S::(anonymous struct at line:2:3)' xvalue
        `-MemberExpr &lt;col:10, col:14&gt; 'S::(anonymous struct at line:2:3)' .S::(anonymous struct at line:2:3) 0xbdcb488
          `-CXXTemporaryObjectExpr &lt;col:10, col:12&gt; 'S' 'void () noexcept' zeroing

Note how the inner MemberExpr (the one accessing the anonymous struct) acts on
a prvalue.

Interestingly, this does not appear to cause any problems for CodeGen, probably
because CodeGen is set up to deal with MemberExprs on rvalues in C. However,
it does cause issues in the dataflow framework, which only supports C++ today
and expects the base of a MemberExpr to be a glvalue.

Beyond the issues with the dataflow framework, I think this issue should be
fixed because it goes contrary to what the C++ standard mandates, and the AST
produced for the non-template case indicates that we want to follow the C++
rules here.

Reasons for erroneous behavior

Here's why we're getting this malformed AST.

First of all, TreeTransform strips any
MaterializeTemporaryExprs

from the AST.

It is therefore up to
TreeTransform::RebuildMemberExpr()
to recreate a MaterializeTemporaryExpr if needed. In the general
case
,
it does this: It calls Sema::BuildMemberReferenceExpr(), which ensures that
the base is a glvalue by
materializing a
temporary

if needed. However, when TreeTransform::RebuildMemberExpr() encounters an
anonymous class, it calls
Sema::BuildFieldReferenceExpr()
,
which, unlike Sema::BuildMemberReferenceExpr(), does not make sure that the
base is a glvalue.

Proposed fix

I considered several possible ways to fix this issue:

  • Add logic to Sema::BuildFieldReferenceExpr() that materializes a temporary
    if needed. This appears to work, but it feels like the fix is in the wrong
    place:

    • AFAIU, other callers of Sema::BuildFieldReferenceExpr() don't need this
      logic.
    • The issue is caused by TreeTransform removing the
      MaterializeTemporaryExpr, so it seems the fix should also be in
      TreeTransform
  • Materialize the temporary directly in TreeTransform::RebuildMemberExpr() if
    needed (within the case that deals with anonymous classes).

    This would work, too, but it would duplicate logic that already exists in
    Sema::BuildMemberReferenceExpr() (which we leverage for the general case).

  • Use Sema::BuildMemberReferenceExpr() instead of
    Sema::BuildFieldReferenceExpr() for the anonymous class case, so that it
    also uses the existing logic for materializing the temporary.

    This is the option I've decided to go with here. There's a slight wrinkle in
    that we create a LookupResult that claims we looked up the unnamed field
    for the anonymous class -- even though we would obviously never be able to
    look up an unnamed field. I think this is defensible and still better than
    the other alternatives, but I would welcome feedback on this from others who
    know the code better.


Full diff: https://github.com/llvm/llvm-project/pull/90842.diff

3 Files Affected:

  • (modified) clang/lib/Sema/TreeTransform.h (+14-3)
  • (added) clang/test/AST/ast-dump-anonymous-class.cpp (+49)
  • (modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+35)
diff --git a/clang/lib/Sema/TreeTransform.h b/clang/lib/Sema/TreeTransform.h
index f47bc219e6fa32..9c800c5705c7c6 100644
--- a/clang/lib/Sema/TreeTransform.h
+++ b/clang/lib/Sema/TreeTransform.h
@@ -2876,10 +2876,21 @@ class TreeTransform {
         return ExprError();
       Base = BaseResult.get();
 
+      // We want to use `BuildMemberReferenceExpr()` so we can use its logic
+      // that materializes `Base` into a temporary if it's a prvalue.
+      // To do so, we need to create a `LookupResult` for `Member`, even though
+      // it's an unnamed field (that we could never actually have looked up).
+      // This small hack seems preferable to duplicating the logic for
+      // materializing the temporary.
+      LookupResult R(getSema(), MemberNameInfo, Sema::LookupMemberName);
+      R.addDecl(Member);
+      R.resolveKind();
+
       CXXScopeSpec EmptySS;
-      return getSema().BuildFieldReferenceExpr(
-          Base, isArrow, OpLoc, EmptySS, cast<FieldDecl>(Member),
-          DeclAccessPair::make(FoundDecl, FoundDecl->getAccess()), MemberNameInfo);
+      return getSema().BuildMemberReferenceExpr(
+          Base, Base->getType(), OpLoc, isArrow, EmptySS, TemplateKWLoc,
+          FirstQualifierInScope, R, ExplicitTemplateArgs,
+          /*S*/ nullptr);
     }
 
     CXXScopeSpec SS;
diff --git a/clang/test/AST/ast-dump-anonymous-class.cpp b/clang/test/AST/ast-dump-anonymous-class.cpp
new file mode 100644
index 00000000000000..393c084c913d15
--- /dev/null
+++ b/clang/test/AST/ast-dump-anonymous-class.cpp
@@ -0,0 +1,49 @@
+// RUN: %clang_cc1 -std=c++17 -triple x86_64-unknown-unknown -ast-dump %s \
+// RUN: | FileCheck -strict-whitespace %s
+
+struct S {
+  struct {
+    int i;
+  };
+};
+
+int accessInRegularFunction() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInRegularFunction 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in a function template with an unused template
+// parameter.
+template <class>
+int accessInFunctionTemplate() {
+  return S().i;
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()'
+  // CHECK:      |   `-ReturnStmt {{.*}}
+  // CHECK-NEXT: |     `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT: |       `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT: |         `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT: |           `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT: |             `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT: |               `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
+}
+
+// AST should look the same in an instantiation of the function template.
+// This is a regression test: The AST used to contain the
+// `MaterializeTemporaryExpr` in the wrong place, causing a `MemberExpr` to have
+// a prvalue base (which is not allowed in C++).
+template int accessInFunctionTemplate<int>();
+  // CHECK: FunctionDecl {{.*}} accessInFunctionTemplate 'int ()' explicit_instantiation_definition
+  // CHECK:          `-ReturnStmt {{.*}}
+  // CHECK-NEXT:       `-ExprWithCleanups {{.*}} 'int'
+  // CHECK-NEXT:         `-ImplicitCastExpr {{.*}} 'int' <LValueToRValue>
+  // CHECK-NEXT:           `-MemberExpr {{.*}} 'int' xvalue .i
+  // CHECK-NEXT:             `-MemberExpr {{.*}} 'S::(anonymous struct at {{.*}})
+  // CHECK-NEXT:               `-MaterializeTemporaryExpr {{.*}} 'S' xvalue
+  // CHECK-NEXT:                 `-CXXTemporaryObjectExpr {{.*}} 'S' 'void () noexcept' zeroing
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 301bec32c0cf1d..3f68adaf1c62d2 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -27,6 +27,14 @@
 #include <string>
 #include <utility>
 
+namespace clang {
+namespace dataflow {
+namespace {
+AST_MATCHER(FunctionDecl, isTemplated) { return Node.isTemplated(); }
+} // namespace
+} // namespace dataflow
+} // namespace clang
+
 namespace {
 
 using namespace clang;
@@ -7205,4 +7213,31 @@ TEST(TransferTest, ConditionalRelation) {
       });
 }
 
+// This is a crash repro.
+// We used to crash while transferring `S().i` because Clang contained a bug
+// causing the AST to be malformed.
+TEST(TransferTest, AnonymousUnionMemberExprInTemplate) {
+  using ast_matchers::functionDecl;
+  using ast_matchers::hasName;
+  using ast_matchers::unless;
+
+  std::string Code = R"cc(
+    struct S {
+      struct {
+        int i;
+      };
+    };
+
+    template <class>
+    void target() {
+        S().i;
+    }
+
+    template void target<int>();
+  )cc";
+  auto Matcher = functionDecl(hasName("target"), unless(isTemplated()));
+  ASSERT_THAT_ERROR(checkDataflowWithNoopAnalysis(Code, Matcher),
+                    llvm::Succeeded());
+}
+
 } // namespace

@martinboehme martinboehme requested a review from AaronBallman May 2, 2024 10:15
@martinboehme
Copy link
Contributor Author

@AaronBallman You're probably not the right person to review this yourself, but I'm told you probably have an idea of who might be an appropriate reviewer?

Copy link
Collaborator

@shafik shafik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this fix.

You should reference the issue that this fixes in your summary.

This also need a release note.

return getSema().BuildMemberReferenceExpr(
Base, Base->getType(), OpLoc, isArrow, EmptySS, TemplateKWLoc,
FirstQualifierInScope, R, ExplicitTemplateArgs,
/*S*/ nullptr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*S*/ nullptr);
/*S=*/nullptr);

We use bugprone-argument-comment format for these.

@erichkeane
Copy link
Collaborator

This is the option I've decided to go with here. There's a slight wrinkle in that we create a LookupResult that claims we looked up the unnamed field for the anonymous class -- even though we would obviously never be able to look up an unnamed field. I think this is defensible and still better than the other alternatives, but I would welcome feedback on this from others who know the code better.

I actually prefer option#2. This doesn't really do what it should, so I think we should just make sure we materialize the temporary if necessary in RebuildMemberExpr.

@shafik
Copy link
Collaborator

shafik commented May 2, 2024

Note there is a BuildAnonymousStructUnionMemberReference, I am not sure it solves your problem.

Copy link
Collaborator

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility to consider: when transforming a member access, strip off any implicit member accesses from the base expression before transforming the base. That way, we'll rebuild the implicit member access from scratch, which might be a little more work, but should get details like this one right. (We can then probably also remove the logic to deal with an absent DeclName in RebuildMemberExpr since that shouldn't happen any more.)

@martinboehme
Copy link
Contributor Author

Thank you for this fix.

You should reference the issue that this fixes in your summary.

There isn't currently a github issue for this. Should I create one?

This also need a release note.

Added. How does this look? (Is it in the appropriate section?)

@martinboehme
Copy link
Contributor Author

This is the option I've decided to go with here. There's a slight wrinkle in that we create a LookupResult that claims we looked up the unnamed field for the anonymous class -- even though we would obviously never be able to look up an unnamed field. I think this is defensible and still better than the other alternatives, but I would welcome feedback on this from others who know the code better.

I actually prefer option#2. This doesn't really do what it should, so I think we should just make sure we materialize the temporary if necessary in RebuildMemberExpr.

Done. WDYT?

@martinboehme
Copy link
Contributor Author

Another possibility to consider: when transforming a member access, strip off any implicit member accesses from the base expression before transforming the base. That way, we'll rebuild the implicit member access from scratch, which might be a little more work, but should get details like this one right. (We can then probably also remove the logic to deal with an absent DeclName in RebuildMemberExpr since that shouldn't happen any more.)

This sounds enticing, but after experimenting with this a bit, I’ve concluded that it’s unfortunately not so simple to get this working (unless there’s something I’m overlooking).

Looking at the relevant part of the AST:

    `-MemberExpr <col:10, col:14> 'int' xvalue .i 0xbdcb3c0
      `-MemberExpr <col:10, col:14> 'S::(anonymous struct at line:2:3)' xvalue .S::(anonymous struct at line:2:3) 0xbdcb488
        `-MaterializeTemporaryExpr <col:10, col:12> 'S' xvalue
          `-CXXTemporaryObjectExpr <col:10, col:12> 'S' 'void () noexcept' zeroing

What I think you’re suggesting is that while processing the base for the outer MemberExpr (the one for .i), we strip the inner MemberExpr (the one for the anonymous field). The idea would be that the MemberExpr for the anonymous field should get recreated within Sema::BuildMemberReferenceExpr() when it’s called with the field declaration for i.

This would work if we handed an IndirectFieldDecl to BuildMemberReferenceExpr(), but the problem is that we have no easy way of getting to the IndirectFieldDecl. The getMemberDecl() for the outer MemberExpr is a plain FieldDecl; the getFoundDecl() is, too.

The getFoundDecl() for the inner MemberExpr does refer to the IndirectFieldDecl we want. So we could try to retain only this MemberExpr and strip the others – but then we run into the issue that there may not even be an inner MemberExpr if the anonymous union is declared inside a function rather than inside a class (example).

My feeling is that rather than trying to deal with all of the various cases involved in stripping, then recreating the implicit MemberExprs, we retain them and merely recreate the MaterializeTemporaryExpr if needed.

@martinboehme
Copy link
Contributor Author

Note there is a BuildAnonymousStructUnionMemberReference, I am not sure it solves your problem.

Thanks. The issue with this is that it requires an IndirectFieldDecl to call it, and that's hard to get at -- see also my response to @zygoloid.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems simple enough, and the AST output looks reasonable.

@martinboehme
Copy link
Contributor Author

@zygoloid @shafik Any additional comments here, or is this good to go?

Copy link
Collaborator

@zygoloid zygoloid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing further on my side, LGTM

@martinboehme
Copy link
Contributor Author

CI failure (DataFlowSanitizer-x86_64 :: release_shadow_space.c) looks unrelated.

Will merge now. @shafik If you have any remaining comments, happy to address those in a followup PR.

@martinboehme martinboehme merged commit fcd020d into llvm:main May 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants