
[clang][CodeGen] Check initializer of zero-size fields for nullptr #109271


Merged
merged 2 commits into from
Sep 20, 2024


Michael137
Member

In #96422 we started treating empty records as zero-sized for the purpose of layout. In C, empty fields were never considered isZeroSize, so we would never have tried to call Init->HasSideEffects on them. Since #96422, however, we can reach this code when compiling C, where Init need not exist. This patch adds a null check to account for this situation.
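
For readers unfamiliar with the code path, here is a minimal, self-contained sketch of the guard this patch adds. `Expr`, `FieldInit` and `canBuildConstant` are hypothetical stand-ins invented for illustration, not clang's real types or API; only the shape of the `Init && Init->HasSideEffects(...)` check is taken from the diff.

```cpp
#include <vector>

// Hypothetical stand-in for clang's Expr; it models only the one query used here.
struct Expr {
  bool SideEffects;
  bool HasSideEffects() const { return SideEffects; }
};

// Hypothetical pairing of a field with its initializer, which may be absent.
struct FieldInit {
  bool IsEmptyForLayout;
  const Expr *Init; // may be null, as for `union Foo foo = {};` compiled as C
};

// Returns false if an initializer prevents emitting the aggregate as a constant.
bool canBuildConstant(const std::vector<FieldInit> &Fields) {
  for (const FieldInit &F : Fields) {
    if (F.IsEmptyForLayout) {
      // Guard the dereference: a missing initializer cannot have side effects.
      if (F.Init && F.Init->HasSideEffects())
        return false;
      continue;
    }
    // Emission of non-empty fields is omitted from this sketch.
  }
  return true;
}
```

Without the `F.Init &&` guard, the first case in which the initializer is null would dereference a null pointer, which is the crash described above.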

llvm#96422 changed layout to treat empty
records as zero-sized. In `C`, empty fields
were never considered `isZeroSize`, so we would never have tried to
call `Init->HasSideEffects` on them. But since llvm#96422
we can get here when compiling `C`, where `Init` need not exist. This
patch adds a null check to account for this situation.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Sep 19, 2024
@llvmbot
Member

llvmbot commented Sep 19, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Michael Buch (Michael137)

Changes

In #96422 we started treating empty records as zero-sized for the purpose of layout. In C, empty fields were never considered isZeroSize, so we would never have tried to call Init->HasSideEffects on them. Since #96422, however, we can reach this code when compiling C, where Init need not exist. This patch adds a null check to account for this situation.


Full diff: https://github.com/llvm/llvm-project/pull/109271.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGExprConstant.cpp (+1-1)
  • (added) clang/test/CodeGenCXX/union-empty-field-init.c (+11)
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index f22321f0e738a1..dd65080a840446 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -738,7 +738,7 @@ bool ConstStructBuilder::Build(const InitListExpr *ILE, bool AllowOverwrite) {
     // Zero-sized fields are not emitted, but their initializers may still
     // prevent emission of this struct as a constant.
     if (isEmptyFieldForLayout(CGM.getContext(), Field)) {
-      if (Init->HasSideEffects(CGM.getContext()))
+      if (Init && Init->HasSideEffects(CGM.getContext()))
         return false;
       continue;
     }
diff --git a/clang/test/CodeGenCXX/union-empty-field-init.c b/clang/test/CodeGenCXX/union-empty-field-init.c
new file mode 100644
index 00000000000000..1ca8d84473e781
--- /dev/null
+++ b/clang/test/CodeGenCXX/union-empty-field-init.c
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 %s -emit-llvm -triple x86_64-linux-gnu -o - | FileCheck %s --check-prefixes=CHECK
+// RUN: %clang_cc1 -x c++ %s -emit-llvm -triple x86_64-linux-gnu -o - | FileCheck %s --check-prefixes=CHECK-CXX
+
+union Foo {
+  struct Empty {} val;
+};
+
+union Foo foo = {};
+
+// CHECK: @foo = {{.*}}global %union.Foo undef, align 1
+// CHECK-CXX: @foo = {{.*}}global %union.Foo undef, align 1


union Foo foo = {};

// CHECK: @foo = {{.*}}global %union.Foo undef, align 1
Member Author

Does this seem reasonable to you @efriedma-quic? Essentially this is treated as if we had written:

union Foo {
  [[no_unique_address]] struct Empty {} val;
};

union Foo foo = {};
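
As a rough illustration of what the attribute in this comparison does in C++, the sketch below contrasts an ordinary empty member with a `[[no_unique_address]]` one. `Plain` and `Overlapped` are hypothetical names; the example assumes a C++20 compiler and deliberately asserts only that the attribute never enlarges the layout, since the exact sizes are ABI-dependent.

```cpp
#include <type_traits>

struct Empty {};

// An empty member declared normally still needs its own address.
struct Plain {
  Empty val;
  int x;
};

// With [[no_unique_address]] the empty member is allowed to overlap `x`.
struct Overlapped {
  [[no_unique_address]] Empty val;
  int x;
};

static_assert(std::is_empty_v<Empty>, "Empty has no non-static data members");
static_assert(sizeof(Overlapped) <= sizeof(Plain),
              "the attribute never enlarges the layout");
```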

Collaborator

Generated code seems okay. I mean, it's the same thing we've always generated for similar constructs. I'd prefer to integrate this into some existing codegen test if we can, though...

Member Author

Moved it to union-init2.c. Looked like an appropriate place


@@ -738,7 +738,7 @@ bool ConstStructBuilder::Build(const InitListExpr *ILE, bool AllowOverwrite) {
     // Zero-sized fields are not emitted, but their initializers may still
     // prevent emission of this struct as a constant.
     if (isEmptyFieldForLayout(CGM.getContext(), Field)) {
-      if (Init->HasSideEffects(CGM.getContext()))
+      if (Init && Init->HasSideEffects(CGM.getContext()))
Collaborator

Maybe we should look at making InitListExpr handling more consistent, though... it looks like we generate an implicit initializer expression for C++ classes, but not other cases. But this should do the right thing.

Member Author

Yea agreed, wasn't sure if that was an intentional choice or not. I'll take a closer look re. InitListExpr

Member Author

This is where C and C++ diverge (in InitListChecker::PerformEmptyInit):

  // C++ [dcl.init.aggr]p7:
  //   If there are fewer initializer-clauses in the list than there are
  //   members in the aggregate, then each member not explicitly initialized
  //   ...
  bool EmptyInitList = SemaRef.getLangOpts().CPlusPlus11 &&
      Entity.getType()->getBaseElementTypeUnsafe()->isRecordType();
  if (EmptyInitList) {
    // C++1y / DR1070:
    //   shall be initialized [...] from an empty initializer list.
    //
    // We apply the resolution of this DR to C++11 but not C++98, since C++98
    // does not have useful semantics for initialization from an init list.
    // We treat this as copy-initialization, because aggregate initialization
    // always performs copy-initialization on its elements.
    //
    // Only do this if we're initializing a class type, to avoid filling in
    // the initializer list where possible.
    InitExpr = VerifyOnly
                   ? &DummyInitList
                   : new (SemaRef.Context)
                         InitListExpr(SemaRef.Context, Loc, std::nullopt, Loc);
    InitExpr->setType(SemaRef.Context.VoidTy);
    SubInit = InitExpr;
    Kind = InitializationKind::CreateCopy(Loc, Loc);
  } else {
    // C++03:
    //   shall be value-initialized.
  }

Technically this was only allowed as a GNU extension in C until C23 AFAIU (and in C++ since C++11).

Should we allow this for SemaRef.getLangOpts().C23 too perhaps?
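
To make the question concrete, here is a toy model of the condition. `LangOpts` is a hypothetical two-flag mirror of the real language options, and `emptyInitListWithC23` is only a sketch of the idea being floated here, not code from this patch or from clang.

```cpp
// Hypothetical mirror of the two language options consulted in this sketch.
struct LangOpts {
  bool CPlusPlus11 = false;
  bool C23 = false;
};

// The condition as quoted from InitListChecker::PerformEmptyInit.
inline bool emptyInitListToday(const LangOpts &LO, bool IsRecordType) {
  return LO.CPlusPlus11 && IsRecordType;
}

// Variant under discussion (an assumption): also synthesize the implicit
// empty initializer list when compiling as C23.
inline bool emptyInitListWithC23(const LangOpts &LO, bool IsRecordType) {
  return (LO.CPlusPlus11 || LO.C23) && IsRecordType;
}
```

Under this model, a C23 translation unit gets no implicit initializer expression today, which is why the null check is still needed in that mode.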

@AaronBallman

Collaborator

Should we allow this for SemaRef.getLangOpts().C23 too perhaps?

Empty initialization ends up in C's "default initialization" rules, which say:

If an object that has automatic storage duration is not initialized explicitly, its representation is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, or any object is initialized with an empty initializer, then it is subject to default initialization, which initializes an object as follows:
— if it has pointer type, it is initialized to a null pointer;
— if it has decimal floating type, it is initialized to positive zero, and the quantum exponent is implementation-defined;
— if it has arithmetic type, and it does not have decimal floating type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits.

So yeah, I think we should do this in C23 mode.
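
The observable effect of these rules can be sketched with the C++ analogue, where empty-brace aggregate initialization gives the same results for pointer, arithmetic, aggregate and union members. This is an illustration written in C++, not C23 code, and `Point`, `Number`, `makePoint` and `makeNumber` are hypothetical names.

```cpp
struct Point {
  int x;
  double y;
  const char *name;
};

union Number {
  int i;
  float f;
};

// Empty-brace initialization of an aggregate: every member ends up zero or null.
inline Point makePoint() {
  Point p = {};
  return p;
}

// Empty-brace initialization of a union: the first member is the one initialized.
inline Number makeNumber() {
  Number n = {};
  return n;
}
```

Here `makePoint()` yields `x == 0`, `y == 0.0` and a null `name`, and `makeNumber().i` is `0`, matching the pointer, arithmetic, aggregate and union bullets quoted above.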

Member Author

Thanks for confirming. I'll put up a PR for that separately. Though we'll still have to go ahead with this null check, for the non-C23 case (unless we want to extend this logic for the GNU extension too?)

Collaborator

We should check GCC's behavior and follow their lead since it's a GNU extension, but my hope is that we can extend the logic for the GNU extension as well (I think it's more intuitive).

Collaborator

Looking at the comments ("Only do this if we're initializing a class type, to avoid filling in the initializer list where possible"), I guess the member init is being intentionally omitted to try to reduce the size of the AST... but for C++11 classes, we need to do overload resolution etc., so we can't skip it.

Collaborator

@efriedma-quic efriedma-quic left a comment

LGTM

@Michael137 Michael137 merged commit 2162a18 into llvm:main Sep 20, 2024
8 checks passed
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Sep 20, 2024
…lvm#109271)

In llvm#96422 we started treating
empty records as zero-sized for the purpose of layout. In `C`, empty
fields were never considered `isZeroSize`, so we would never have tried
to call `Init->HasSideEffects` on them. But since
llvm#96422 we can get here when
compiling `C`, where `Init` need not exist. This patch adds a null check
to account for this situation.

(cherry picked from commit 2162a18)
@Michael137 Michael137 deleted the bugfix/empty-union-field-init-crash branch September 20, 2024 21:54
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
4 participants