-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang] Skip auto-init on scalar vars that have a non-constant Init and no self-ref #94642
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…lf-ref In that scalar case, the Init should initialize the auto var before use. The Init might use uninitialized memory from other sources (e.g., heap) but auto-init did not help us in that case because the auto-init would have been overwritten by the Init before use. For non-scalars e.g., classes, the Init expr might be a ctor call that leaves uninitialized members, so we leave the auto-init there. The motivation is to have less IR for the optimizer to reduce, which may be in a fairly late pass (DSE) or may not get optimized in lower optimization levels like O1 (no DSe). This is ~10% less left-over auto-init in O1 in a few examples checked.
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Jan Voung (jvoung) ChangesIn that scalar case, the Init should initialize the auto var before use. For non-scalars e.g., classes, the Init expr might be a ctor call that The motivation is to have less IR for the optimizer to later remove, which Full diff: https://github.com/llvm/llvm-project/pull/94642.diff 5 Files Affected:
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index 4a213990d1e36..49e97a23cb0a9 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -1972,7 +1972,20 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
}
if (!constant) {
- initializeWhatIsTechnicallyUninitialized(Loc);
+ if (trivialAutoVarInit !=
+ LangOptions::TrivialAutoVarInitKind::Uninitialized) {
+ // At this point, we know D has an Init expression, but isn't a constant.
+ // - If D is not a scalar, auto-var-init conservatively (members may be
+ // left uninitialized by constructor Init expressions for example).
+ // - If D is a scalar, we only need to auto-var-init if there is a
+ // self-reference. Otherwise, the Init expression should be sufficient.
+ // It may be that the Init expression uses other uninitialized memory,
+ // but auto-var-init here would not help, as auto-init would get
+ // overwritten by Init.
+ if (!D.getType()->isScalarType() || isAccessedBy(D, Init)) {
+ initializeWhatIsTechnicallyUninitialized(Loc);
+ }
+ }
LValue lv = MakeAddrLValue(Loc, type);
lv.setNonGC(true);
return EmitExprAsInit(Init, &D, lv, capturedByInit);
diff --git a/clang/test/CodeGenCXX/auto-var-init-max-size.cpp b/clang/test/CodeGenCXX/auto-var-init-max-size.cpp
index ef38b8227a9a1..f4db297a07be8 100644
--- a/clang/test/CodeGenCXX/auto-var-init-max-size.cpp
+++ b/clang/test/CodeGenCXX/auto-var-init-max-size.cpp
@@ -15,7 +15,7 @@ struct Foo {
int foo(unsigned n) {
bool var_size_1;
- long var_size_8 = 123;
+ long var_size_8;
void *var_size_8p;
int var_size_1024[256];
Foo var_size_1028;
diff --git a/clang/test/CodeGenCXX/auto-var-init-stop-after.cpp b/clang/test/CodeGenCXX/auto-var-init-stop-after.cpp
index a782692d0127e..f1dc0e3a068e7 100644
--- a/clang/test/CodeGenCXX/auto-var-init-stop-after.cpp
+++ b/clang/test/CodeGenCXX/auto-var-init-stop-after.cpp
@@ -18,7 +18,7 @@ typedef struct {
int foo(unsigned n) {
// scalar variable
- long a = 888;
+ long a;
// array
S arr[ARRLEN];
// VLA
diff --git a/clang/test/CodeGenCXX/auto-var-init.cpp b/clang/test/CodeGenCXX/auto-var-init.cpp
index e1568bee136e5..e697731b0cdf1 100644
--- a/clang/test/CodeGenCXX/auto-var-init.cpp
+++ b/clang/test/CodeGenCXX/auto-var-init.cpp
@@ -146,16 +146,8 @@ struct notlockfree { long long a[4]; };
// PATTERN-O1-NOT: @__const.test_atomictailpad_uninit.uninit
// PATTERN-O0: @__const.test_complexfloat_uninit.uninit = private unnamed_addr constant { float, float } { float 0xFFFFFFFFE0000000, float 0xFFFFFFFFE0000000 }, align 4
// PATTERN-O1-NOT: @__const.test_complexfloat_uninit.uninit
-// PATTERN-O0: @__const.test_complexfloat_braces.braces = private unnamed_addr constant { float, float } { float 0xFFFFFFFFE0000000, float 0xFFFFFFFFE0000000 }, align 4
-// PATTERN-O1-NOT: @__const.test_complexfloat_braces.braces
-// PATTERN-O0: @__const.test_complexfloat_custom.custom = private unnamed_addr constant { float, float } { float 0xFFFFFFFFE0000000, float 0xFFFFFFFFE0000000 }, align 4
-// PATTERN-O1-NOT: @__const.test_complexfloat_custom.custom
// PATTERN-O0: @__const.test_complexdouble_uninit.uninit = private unnamed_addr constant { double, double } { double 0xFFFFFFFFFFFFFFFF, double 0xFFFFFFFFFFFFFFFF }, align 8
// PATTERN-O1-NOT: @__const.test_complexdouble_uninit.uninit
-// PATTERN-O0: @__const.test_complexdouble_braces.braces = private unnamed_addr constant { double, double } { double 0xFFFFFFFFFFFFFFFF, double 0xFFFFFFFFFFFFFFFF }, align 8
-// PATTERN-O1-NOT: @__const.test_complexdouble_braces.braces
-// PATTERN-O0: @__const.test_complexdouble_custom.custom = private unnamed_addr constant { double, double } { double 0xFFFFFFFFFFFFFFFF, double 0xFFFFFFFFFFFFFFFF }, align 8
-// PATTERN-O1-NOT: @__const.test_complexdouble_custom.custom
// PATTERN-O0: @__const.test_semivolatile_uninit.uninit = private unnamed_addr constant %struct.semivolatile { i32 [[I32]], i32 [[I32]] }, align 4
// PATTERN-O0: @__const.test_semivolatile_custom.custom = private unnamed_addr constant %struct.semivolatile { i32 1145324612, i32 1145324612 }, align 4
// PATTERN-O1-NOT: @__const.test_semivolatile_custom.custom
diff --git a/clang/test/CodeGenCXX/trivial-auto-var-init-skip-scalar-with-nonconst-init.cpp b/clang/test/CodeGenCXX/trivial-auto-var-init-skip-scalar-with-nonconst-init.cpp
new file mode 100644
index 0000000000000..4a7b5d2be5799
--- /dev/null
+++ b/clang/test/CodeGenCXX/trivial-auto-var-init-skip-scalar-with-nonconst-init.cpp
@@ -0,0 +1,110 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown %s -emit-llvm -o - | FileCheck %s -check-prefix=UNINIT
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -ftrivial-auto-var-init=pattern %s -emit-llvm -o - | FileCheck %s -check-prefix=PATTERN
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -ftrivial-auto-var-init=zero %s -emit-llvm -o - | FileCheck %s -check-prefix=ZERO
+
+template<typename T> void used(T &) noexcept;
+
+extern "C" {
+
+extern int get_int(int) noexcept;
+struct C {
+ int x;
+ int y;
+};
+extern C make_c() noexcept;
+
+// Scalar with a self-reference: does need auto-init.
+// UNINIT-LABEL: test_selfinit_call(
+// ZERO-LABEL: test_selfinit_call(
+// ZERO: store i32 0, ptr %self, align 4, !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_selfinit_call(
+// PATTERN: store i32 -1431655766, ptr %self, align 4, !annotation [[AUTO_INIT:!.+]]
+void test_selfinit_call() {
+ int self = get_int(self);
+ used(self);
+}
+
+// Scalar without a self-reference: no auto-init needed.
+// UNINIT-LABEL: test_nonself_call(
+// ZERO-LABEL: test_nonself_call(
+// ZERO-NOT: !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_nonself_call(
+// PATTERN-NOT: !annotation [[AUTO_INIT:!.+]]
+void test_nonself_call() {
+ int x = get_int(2);
+ used(x);
+}
+
+// Scalar with a self-reference: does need auto-init.
+// UNINIT-LABEL: test_selfinit_lambda_call(
+// ZERO-LABEL: test_selfinit_lambda_call(
+// ZERO: store i32 0, ptr %self, align 4, !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_selfinit_lambda_call(
+// PATTERN: store i32 -1431655766, ptr %self, align 4, !annotation [[AUTO_INIT:!.+]]
+void test_selfinit_lambda_call() {
+ int self = [&](){ return self; }();
+ used(self);
+}
+
+// Not a scalar: auto-init just in case
+// UNINIT-LABEL: test_nonscalar_call(
+// ZERO-LABEL: test_nonscalar_call(
+// ZERO: call void @llvm.memset{{.*}}, i8 0, i64 8, {{.*}} !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_nonscalar_call(
+// PATTERN: call void @llvm.memcpy{{.*}}, i64 8, {{.*}} !annotation [[AUTO_INIT:!.+]]
+void test_nonscalar_call() {
+ C c = make_c();
+ used(c);
+}
+
+// Scalar with a self-reference: does need auto-init.
+// UNINIT-LABEL: test_self_ptr(
+// ZERO-LABEL: test_self_ptr(
+// ZERO: store ptr null, ptr %self, align 8, !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_self_ptr(
+// PATTERN: store ptr inttoptr (i64 -6148914691236517206 to ptr), ptr %self, align 8, !annotation [[AUTO_INIT:!.+]]
+void test_self_ptr() {
+ void* self = self;
+ used(self);
+}
+
+// Scalar without a self-reference: no auto-init needed.
+// UNINIT-LABEL: test_nonself_ptr(
+// ZERO-LABEL: test_nonself_ptr(
+// ZERO-NOT: !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_nonself_ptr(
+// PATTERN-NOT: !annotation [[AUTO_INIT:!.+]]
+void test_nonself_ptr() {
+ int y = 0;
+ void* x = &y;
+ used(x);
+}
+
+// Scalar with a self-reference: does need auto-init.
+// UNINIT-LABEL: test_self_complex(
+// ZERO-LABEL: test_self_complex(
+// ZERO: call void @llvm.memset{{.*}} !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_self_complex(
+// PATTERN: call void @llvm.memcpy{{.*}} !annotation [[AUTO_INIT:!.+]]
+void test_self_complex() {
+ _Complex float self = 3.0 * 3.0 * self;
+ used(self);
+}
+
+// Scalar without a self-reference: no auto-init needed.
+// UNINIT-LABEL: test_nonself_complex(
+// ZERO-LABEL: test_nonself_complex(
+// ZERO-NOT: !annotation [[AUTO_INIT:!.+]]
+// PATTERN-LABEL: test_nonself_complex(
+// PATTERN-NOT: !annotation [[AUTO_INIT:!.+]]
+void test_nonself_complex() {
+ _Complex float y = 0.0;
+ _Complex float x = 3.0 * 3.0 * y;
+ used(x);
+}
+
+} // extern "C"
+
+// ZERO: [[AUTO_INIT]] = !{!"auto-init"}
+// PATTERN: [[AUTO_INIT]] = !{!"auto-init"}
+
|
+cc @dwblaikie @aeubanks |
Drive-by comment: can you please either merge the two commits together, or change their descriptions so that they are not 100% identical? |
@@ -15,7 +15,7 @@ struct Foo { | |||
|
|||
int foo(unsigned n) { | |||
bool var_size_1; | |||
long var_size_8 = 123; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have both the test cases in which scalar variables are auto-inited and where the initialization is skipped?
Something along the lines of:
long var_size_8_init = 123;
long var_size_8_noinit;
long var_size_8_init_later;
...
var_size_8_init_later = 456;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Added those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ter) We check that the init doesn't have auto-init of pattern or 0 and we check that noinit and init_later do.
3c09d21
to
e529d26
Compare
Oops, changed the later commit description! |
|
||
// Scalar with a self-reference: does need auto-init. | ||
// UNINIT-LABEL: test_selfinit_lambda_call( | ||
// ZERO-LABEL: test_selfinit_lambda_call( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, if "UNINIT-LABEL", "ZERO-LABEL", and "PATTERN-LABEL" are always the same, just define a common label like "CHECK-LABEL:" to avoid redundancy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if a common "CHECK-LABEL:" would have the same semantics and help split the stream for matching when there is a different prefix like "ZERO: " (or "PATTERN:") that follows, or if the CHECK and the ZERO end up as a independent scans. If it's an independent scan, then "CHECK-LABEL" wouldn't have the same semantics as "ZERO-LABEL:" (which helps split the stream for the "ZERO:" checks that follow -- e.g., in case there a multiple places that could match something generic like "store i32 0, ", the preceding ZERO-LABEL help disambiguate which to match).
I'd prefer to keep the redundancy just to be safe and more clear that the X-LABEL are for the X that it's associated with, unless you're pretty sure it can cross over different check prefixes.
LGTM, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks from the C++ language rules and compiler frontend perspective.
The logic behinds the checks seems solid, there should not be any uncaught reads.
I am still unsure if we are doing the right thing for Objective C's -fblocks
, left a comment asking about it.
In a few tests it appears that when capturedByInit is true, isAccessedBy(Var, Init) is also true. capturedByInit also recurses through the children of the Init expression. Just to be safe, check capturedByInit || isAccessedBy -- and the capturedByInit could help short-circuit in some cases and avoid repeating a recursive traversal of children.
LGTM! |
…nd no self-ref (llvm#94642) In that scalar case, the Init should initialize the auto var before use. The Init might use uninitialized memory from other sources (e.g., heap) but auto-init did not help us in that case because the auto-init would have been overwritten by the Init before use. For non-scalars e.g., classes, the Init expr might be a ctor call that leaves uninitialized members, so we leave the auto-init there. The motivation is to have less IR for the optimizer to later remove, which may not be until a fairly late pass (DSE) or may not get optimized in lower optimization levels like O1 (no DSE) or sometimes due to derefinement. This is ~10% less left-over auto-init in O1 in a few examples checked.
In that scalar case, the Init should initialize the auto var before use.
The Init might use uninitialized memory from other sources (e.g., heap)
but auto-init did not help us in that case because the auto-init would
have been overwritten by the Init before use.
For non-scalars e.g., classes, the Init expr might be a ctor call that
leaves uninitialized members, so we leave the auto-init there.
The motivation is to have less IR for the optimizer to later remove, which
may not be until a fairly late pass (DSE) or may not get optimized in lower
optimization levels like O1 (no DSE) or sometimes due to derefinement.
This is ~10% less left-over auto-init in O1 in a few examples checked