[clang][CodeGen] Check initializer of zero-size fields for nullptr #109271
llvm#96422 changed the layout code to treat empty records as zero-sized. In `C`, empty fields were never considered `isZeroSize`, so we would never have tried to call `Init->HasSideEffects` on them. Since llvm#96422 we can reach this code when compiling `C`, but `Init` need not exist. This patch adds a null check to account for this situation.
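The guarded access described above can be illustrated in isolation. This is a minimal sketch, not the clang implementation: `MockExpr`, `MockField`, and `canBuildConstant` are hypothetical stand-ins for `Expr`, the field/initializer pair, and the loop in `ConstStructBuilder::Build`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for clang::Expr; it only models the side-effect query.
struct MockExpr {
  bool SideEffects;
  bool HasSideEffects() const { return SideEffects; }
};

// Hypothetical stand-in for a field plus its (possibly absent) initializer.
struct MockField {
  bool IsEmptyForLayout;
  const MockExpr *Init; // may be null, e.g. for `union Foo foo = {};` in C
};

// Returns false if an empty field's initializer prevents constant emission.
bool canBuildConstant(const std::vector<MockField> &Fields) {
  for (const MockField &F : Fields) {
    if (F.IsEmptyForLayout) {
      // Check for null first: an empty field may have no initializer at all.
      if (F.Init && F.Init->HasSideEffects())
        return false;
      continue;
    }
    // Real emission of non-empty fields is omitted from this sketch.
  }
  return true;
}
```

With an unguarded `F.Init->HasSideEffects()`, the first case (empty field, null initializer) would dereference a null pointer, which is the crash this patch avoids.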
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang
Author: Michael Buch (Michael137)
Changes: In #96422 we started treating empty records as zero-sized for the purpose of layout. In [...]
Full diff: https://github.com/llvm/llvm-project/pull/109271.diff
2 Files Affected:
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index f22321f0e738a1..dd65080a840446 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -738,7 +738,7 @@ bool ConstStructBuilder::Build(const InitListExpr *ILE, bool AllowOverwrite) {
// Zero-sized fields are not emitted, but their initializers may still
// prevent emission of this struct as a constant.
if (isEmptyFieldForLayout(CGM.getContext(), Field)) {
- if (Init->HasSideEffects(CGM.getContext()))
+ if (Init && Init->HasSideEffects(CGM.getContext()))
return false;
continue;
}
diff --git a/clang/test/CodeGenCXX/union-empty-field-init.c b/clang/test/CodeGenCXX/union-empty-field-init.c
new file mode 100644
index 00000000000000..1ca8d84473e781
--- /dev/null
+++ b/clang/test/CodeGenCXX/union-empty-field-init.c
@@ -0,0 +1,11 @@
+// RUN: %clang_cc1 %s -emit-llvm -triple x86_64-linux-gnu -o - | FileCheck %s --check-prefixes=CHECK
+// RUN: %clang_cc1 -x c++ %s -emit-llvm -triple x86_64-linux-gnu -o - | FileCheck %s --check-prefixes=CHECK-CXX
+
+union Foo {
+ struct Empty {} val;
+};
+
+union Foo foo = {};
+
+// CHECK: @foo = {{.*}}global %union.Foo undef, align 1
+// CHECK-CXX: @foo = {{.*}}global %union.Foo undef, align 1
union Foo foo = {};

// CHECK: @foo = {{.*}}global %union.Foo undef, align 1
Does this seem reasonable to you @efriedma-quic? Essentially this is treated as if we had written:
union Foo {
[[no_unique_address]] struct Empty {} val;
};
union Foo foo = {};
Generated code seems okay. I mean, it's the same thing we've always generated for similar constructs. I'd prefer to integrate this into some existing codegen test if we can, though...
Moved it to union-init2.c. Looked like an appropriate place.
@@ -738,7 +738,7 @@ bool ConstStructBuilder::Build(const InitListExpr *ILE, bool AllowOverwrite) {
 // Zero-sized fields are not emitted, but their initializers may still
 // prevent emission of this struct as a constant.
 if (isEmptyFieldForLayout(CGM.getContext(), Field)) {
- if (Init->HasSideEffects(CGM.getContext()))
+ if (Init && Init->HasSideEffects(CGM.getContext()))
Maybe we should look at making InitListExpr handling more consistent, though... it looks like we generate an implicit initializer expression for C++ classes, but not other cases. But this should do the right thing.
Yea agreed, wasn't sure if that was an intentional choice or not. I'll take a closer look re. InitListExpr
This is where C and C++ diverge (in InitListChecker::PerformEmptyInit):
// C++ [dcl.init.aggr]p7:
// If there are fewer initializer-clauses in the list than there are
// members in the aggregate, then each member not explicitly initialized
// ...
bool EmptyInitList = SemaRef.getLangOpts().CPlusPlus11 &&
Entity.getType()->getBaseElementTypeUnsafe()->isRecordType();
if (EmptyInitList) {
// C++1y / DR1070:
// shall be initialized [...] from an empty initializer list.
//
// We apply the resolution of this DR to C++11 but not C++98, since C++98
// does not have useful semantics for initialization from an init list.
// We treat this as copy-initialization, because aggregate initialization
// always performs copy-initialization on its elements.
//
// Only do this if we're initializing a class type, to avoid filling in
// the initializer list where possible.
InitExpr = VerifyOnly
? &DummyInitList
: new (SemaRef.Context)
InitListExpr(SemaRef.Context, Loc, std::nullopt, Loc);
InitExpr->setType(SemaRef.Context.VoidTy);
SubInit = InitExpr;
Kind = InitializationKind::CreateCopy(Loc, Loc);
} else {
// C++03:
// shall be value-initialized.
}
Technically this was only allowed as a GNU extension in C until C23 AFAIU (and in C++ since C++11).
Should we allow this for SemaRef.getLangOpts().C23 too perhaps?
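To make the condition under discussion concrete, here is a simplified model of the `EmptyInitList` decision. `MockLangOpts` and both functions are hypothetical names for illustration; the second function reflects the variant being asked about, not merged code.

```cpp
#include <cassert>

// Hypothetical, simplified mirror of the language options consulted above.
struct MockLangOpts {
  bool CPlusPlus11;
  bool C23;
};

// Models the quoted condition: an implicit `{}` initializer is only
// synthesized for record types when compiling as C++11 or later.
bool createsImplicitEmptyInit(const MockLangOpts &LO, bool IsRecordType) {
  return LO.CPlusPlus11 && IsRecordType;
}

// Models the variant floated in the question (an assumption, not merged code):
// also synthesize the implicit initializer in C23 mode.
bool createsImplicitEmptyInitWithC23(const MockLangOpts &LO, bool IsRecordType) {
  return (LO.CPlusPlus11 || LO.C23) && IsRecordType;
}
```

Under the first function, a C translation unit never gets the implicit initializer, which is why `Init` can be null in `ConstStructBuilder::Build` for the test case in this PR.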
Should we allow this for SemaRef.getLangOpts().C23 too perhaps?
Empty initialization ends up in C's "default initialization" rules, which say:
If an object that has automatic storage duration is not initialized explicitly, its representation is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, or any object is initialized with an empty initializer, then it is subject to default initialization, which initializes an object as follows:
— if it has pointer type, it is initialized to a null pointer;
— if it has decimal floating type, it is initialized to positive zero, and the quantum exponent is implementation-defined;
— if it has arithmetic type, and it does not have decimal floating type, it is initialized to (positive or unsigned) zero;
— if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
— if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits.
So yeah, I think we should do this in C23 mode.
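For comparison, C++ already behaves much like the rules quoted above when an aggregate is initialized with `{}`. The sketch below is C++ (not C23), and the types `Inner`, `FirstWins`, and `Outer` are made up for illustration; it only shows the analogous recursive zeroing of members and the first union member.

```cpp
#include <cassert>

// Illustrative types only; none of these appear in the patch.
struct Inner {
  int *Ptr;
  double D;
};

union FirstWins {
  int *Ptr; // first member: the one an empty initializer initializes
  unsigned char Bytes[sizeof(int *)];
};

struct Outer {
  Inner In;
  unsigned U;
  FirstWins W;
};

// Every member is initialized recursively from an empty initializer list.
Outer makeEmptyInitialized() {
  Outer O = {};
  return O;
}
```

Here the pointer members come out null and the arithmetic members come out zero, which matches the pointer, arithmetic, aggregate, and union bullets in the quoted text.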
Thanks for confirming. I'll put up a PR for that separately. Though we'll still have to go ahead with this null check, for the non-C23 case (unless we want to extend this logic for the GNU extension too?)
We should check GCC's behavior and follow their lead since it's a GNU extension, but my hope is that we can extend the logic for the GNU extension as well (I think it's more intuitive).
Looking at the comments ("Only do this if we're initializing a class type, to avoid filling in the initializer list where possible"), I guess the member init is being intentionally omitted to try to reduce the size of the AST... but for C++11 classes, we need to do overload resolution etc., so we can't skip it.
LGTM
…lvm#109271) In llvm#96422 we started treating empty records as zero-sized for the purpose of layout. In `C`, empty fields were never considered `isZeroSize`, so we would never have tried to call `Init->HasSideEffects` on them. But since llvm#96422 we can get here when compiling `C`, but `Init` need not exist. This patch adds a null-check to account for this situation. (cherry picked from commit 2162a18)