[clang][CodeGen] Check initializer of zero-size fields for nullptr #109271

Merged
merged 2 commits into from
Sep 20, 2024
2 changes: 1 addition & 1 deletion clang/lib/CodeGen/CGExprConstant.cpp
@@ -738,7 +738,7 @@ bool ConstStructBuilder::Build(const InitListExpr *ILE, bool AllowOverwrite) {
// Zero-sized fields are not emitted, but their initializers may still
// prevent emission of this struct as a constant.
if (isEmptyFieldForLayout(CGM.getContext(), Field)) {
- if (Init->HasSideEffects(CGM.getContext()))
+ if (Init && Init->HasSideEffects(CGM.getContext()))
Collaborator

Maybe we should look at making InitListExpr handling more consistent, though... it looks like we generate an implicit initializer expression for C++ classes, but not other cases. But this should do the right thing.

Member Author

Yea agreed, wasn't sure if that was an intentional choice or not. I'll take a closer look re. InitListExpr

Member Author

This is where C and C++ diverge (in InitListChecker::PerformEmptyInit):

  // C++ [dcl.init.aggr]p7:
  //   If there are fewer initializer-clauses in the list than there are
  //   members in the aggregate, then each member not explicitly initialized
  //   ...
  bool EmptyInitList = SemaRef.getLangOpts().CPlusPlus11 &&
      Entity.getType()->getBaseElementTypeUnsafe()->isRecordType();
  if (EmptyInitList) {
    // C++1y / DR1070:
    //   shall be initialized [...] from an empty initializer list.
    //
    // We apply the resolution of this DR to C++11 but not C++98, since C++98
    // does not have useful semantics for initialization from an init list.
    // We treat this as copy-initialization, because aggregate initialization
    // always performs copy-initialization on its elements.
    //
    // Only do this if we're initializing a class type, to avoid filling in
    // the initializer list where possible.
    InitExpr = VerifyOnly
                   ? &DummyInitList
                   : new (SemaRef.Context)
                         InitListExpr(SemaRef.Context, Loc, std::nullopt, Loc);
    InitExpr->setType(SemaRef.Context.VoidTy);
    SubInit = InitExpr;
    Kind = InitializationKind::CreateCopy(Loc, Loc);
  } else {
    // C++03:
    //   shall be value-initialized.
  }

Technically this was only allowed as a GNU extension in C until C23 AFAIU (and in C++ since C++11).

Should we allow this for SemaRef.getLangOpts().C23 too perhaps?

@AaronBallman
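
For readers following the question above, the condition under discussion can be reduced to a small predicate. This is only a sketch: `LangOptsSketch`, `shouldSynthesizeEmptyInitList`, and the `extendToC23` switch are hypothetical names made up for illustration, not clang APIs, and the C23 branch models the proposal in this thread rather than existing behavior.

```cpp
#include <cassert>

// Hypothetical stand-in for the two clang::LangOptions bits that matter here.
struct LangOptsSketch {
  bool CPlusPlus11;
  bool C23;
};

// isRecordType stands in for
// Entity.getType()->getBaseElementTypeUnsafe()->isRecordType().
// extendToC23 models the change proposed in the thread; it is an assumption,
// not existing behavior.
bool shouldSynthesizeEmptyInitList(const LangOptsSketch &LO, bool isRecordType,
                                   bool extendToC23) {
  const bool languageAllows = LO.CPlusPlus11 || (extendToC23 && LO.C23);
  return languageAllows && isRecordType;
}

void checkEmptyInitListPredicate() {
  const LangOptsSketch cxx11 = {true, false};
  const LangOptsSketch c23 = {false, true};
  assert(shouldSynthesizeEmptyInitList(cxx11, true, false));
  assert(!shouldSynthesizeEmptyInitList(cxx11, false, false)); // non-record type
  assert(!shouldSynthesizeEmptyInitList(c23, true, false));    // quoted condition
  assert(shouldSynthesizeEmptyInitList(c23, true, true));      // proposed extension
}
```

With the quoted condition, a C translation unit never gets the synthesized empty `InitListExpr`, which is why the member initializer can be null by the time CodeGen sees it.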

Collaborator

> Should we allow this for SemaRef.getLangOpts().C23 too perhaps?

Empty initialization ends up in C's "default initialization" rules, which say:

If an object that has automatic storage duration is not initialized explicitly, its representation is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, or any object is initialized with an empty initializer, then it is subject to default initialization, which initializes an object as follows:
— if it has pointer type, it is initialized to a null pointer;
— if it has decimal floating type, it is initialized to positive zero, and the quantum exponent is implementation-defined;
— if it has arithmetic type, and it does not have decimal floating type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits.

So yeah, I think we should do this in C23 mode.
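
As a rough illustration of the rules quoted above, the sketch below initializes a nested aggregate from an empty initializer list. It is written in C++, where this form has been accepted since C++11 as noted earlier in the thread; the assumption is that the same declaration in C23 mode is covered by the quoted default-initialization rules. `Inner`, `Outer`, and `checkEmptyInitializer` are made-up names.

```cpp
#include <cassert>

// Hypothetical aggregate types, for illustration only.
struct Inner {
  int *ptr;
  double value;
};

struct Outer {
  int count;
  Inner inner;
  int arr[3];
};

void checkEmptyInitializer() {
  Outer o = {}; // empty initializer: every member is initialized recursively
  assert(o.count == 0);           // arithmetic type -> zero
  assert(o.inner.ptr == nullptr); // pointer type -> null pointer
  assert(o.inner.value == 0.0);   // floating type -> positive zero
  assert(o.arr[0] == 0 && o.arr[1] == 0 && o.arr[2] == 0);
}
```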

Member Author

Thanks for confirming. I'll put up a PR for that separately. Though we'll still have to go ahead with this null check, for the non-C23 case (unless we want to extend this logic for the GNU extension too?)

Collaborator

We should check GCC's behavior and follow their lead since it's a GNU extension, but my hope is that we can extend the logic for the GNU extension as well (I think it's more intuitive).

Collaborator

Looking at the comments ("Only do this if we're initializing a class type, to avoid filling in the initializer list where possible"), I guess the member init is being intentionally omitted to try to reduce the size of the AST... but for C++11 classes, we need to do overload resolution etc., so we can't skip it.

return false;
continue;
}
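
The shape of the fix in this hunk can be sketched outside of clang as follows. `InitSketch`, `FieldSketch`, and `canEmitAsConstant` are hypothetical, heavily simplified stand-ins for `Expr`, `FieldDecl`, and the loop in `ConstStructBuilder::Build`; the sketch only shows why the null check has to come before the call.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for clang's Expr.
struct InitSketch {
  bool sideEffects;
  bool HasSideEffects() const { return sideEffects; }
};

// Hypothetical, simplified stand-in for a field and its initializer.
struct FieldSketch {
  bool emptyForLayout;    // plays the role of isEmptyFieldForLayout(...)
  const InitSketch *init; // may be null when no initializer expression exists
};

// Returns true if the record could still be emitted as a constant.
bool canEmitAsConstant(const std::vector<FieldSketch> &fields) {
  for (const FieldSketch &f : fields) {
    if (f.emptyForLayout) {
      // Null check first: && short-circuits, so a null init is never dereferenced.
      if (f.init && f.init->HasSideEffects())
        return false;
      continue; // zero-size fields are not emitted
    }
    // Emission of non-empty fields is omitted in this sketch.
  }
  return true;
}

void checkNullInitializerIsTolerated() {
  const InitSketch impure = {true};
  std::vector<FieldSketch> fields;
  fields.push_back(FieldSketch{true, nullptr}); // the case this PR fixes
  assert(canEmitAsConstant(fields));
  fields.push_back(FieldSketch{true, &impure}); // side effects block emission
  assert(!canEmitAsConstant(fields));
}
```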
8 changes: 8 additions & 0 deletions clang/test/CodeGen/union-init2.c
@@ -1,4 +1,5 @@
// RUN: %clang_cc1 -emit-llvm %s -o - -triple i686-pc-linux-gnu | FileCheck %s
// RUN: %clang_cc1 -x c++ %s -emit-llvm -triple x86_64-linux-gnu -o - | FileCheck %s --check-prefixes=CHECK-CXX

// Make sure we generate something sane instead of a ptrtoint
// CHECK: @r, [4 x i8] undef
@@ -11,3 +12,10 @@ union z {
long long b;
};
union z y = {};

// CHECK: @foo = {{.*}}global %union.Foo undef, align 1
// CHECK-CXX: @foo = {{.*}}global %union.Foo undef, align 1
union Foo {
struct Empty {} val;
};
union Foo foo = {};
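
For context on the new test case, here is a minimal C++ sketch of the same declaration with a couple of checks on the member type. It relies only on standard C++ facts (`std::is_empty`, nonzero `sizeof`); how clang classifies the field for layout is not something this sketch demonstrates. `checkFooShape` is a made-up helper name.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as the test case above: a union whose only member is an empty struct.
union Foo {
  struct Empty {} val;
};

// Empty has no non-static data members, so std::is_empty reports true,
// even though every complete object type in C++ has a nonzero size.
static_assert(std::is_empty<Foo::Empty>::value, "Empty should be an empty class");
static_assert(sizeof(Foo) >= 1, "objects have nonzero size in C++");

Foo foo = {}; // the initialization exercised by the test

void checkFooShape() {
  assert(std::is_empty<Foo::Empty>::value);
  assert(sizeof(foo) == sizeof(Foo));
}
```

In the C run of the test there is no synthesized initializer for `val`, as discussed in the review thread, which is the situation the added null check guards against.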