Skip to content

[clang][CodeGen] Zero init unspecified fields in initializers in C #97121

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

yabinc
Copy link
Contributor

@yabinc yabinc commented Jun 28, 2024

When an initializer is provided to a variable, the Linux kernel relied
on the compiler to zero-initialize unspecified fields, as clarified in
https://www.spinics.net/lists/netdev/msg1007244.html.

But clang doesn't guarantee this:

  1. For a union type, if an empty initializer is given, clang only
    initializes bytes for the first field, left bytes for other (larger)
    fields are marked as undef. Accessing those undef bytes can lead
    to undefined behaviors.
  2. For a union type, if an initializer explicitly sets a field, left
    bytes for other (larger) fields are marked as undef.
  3. When an initializer is given, clang doesn't zero initialize padding.

So this patch makes the following change:

  1. In C, when an initializer is provided for a variable, zero-initialize
    undef and padding fields in the initializer.
  2. Document the change in LanguageExtensions.rst.

As suggested in #78034 (comment),
the change isn't required by C23, but it's standards conforming to do
so.

Fixes: #97459

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jun 28, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: None (yabinc)

Changes

When an initializer is provided to a variable, the Linux kernel relies on clang to zero-initialize unspecified fields. But clang doesn't guarantee this:

  1. For a union type, if an empty initializer is given, clang only initializes bytes for the first field, left bytes for other (larger) fields are marked as undef. Accessing those undef bytes can lead to undefined behaviors, causing bugs like reported in https://patchwork.kernel.org/project/netdevbpf/patch/20240621211819.1690234-1-yabinc@google.com/#25906597.
  2. For a union type, if an initializer explicitly sets a field, left bytes for other (larger) fields are marked as undef.
  3. When an initializer is given, clang doesn't zero initialize padding.

So this patch makes the following change:
In C, when an initializer is provided for a variable, zero-initialize undef and padding fields in the initializer.


Patch is 35.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97121.diff

18 Files Affected:

  • (modified) clang/lib/CodeGen/CGDecl.cpp (+43-20)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+6)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+2)
  • (modified) clang/test/CodeGen/2008-07-22-bitfield-init-after-zero-len-array.c (+1-1)
  • (modified) clang/test/CodeGen/2008-08-07-AlignPadding1.c (+2-2)
  • (modified) clang/test/CodeGen/64bit-swiftcall.c (+5-7)
  • (modified) clang/test/CodeGen/arm-swiftcall.c (+2-2)
  • (modified) clang/test/CodeGen/const-init.c (+2-2)
  • (modified) clang/test/CodeGen/designated-initializers.c (+6-6)
  • (modified) clang/test/CodeGen/flexible-array-init.c (+12-12)
  • (modified) clang/test/CodeGen/global-init.c (+1-1)
  • (modified) clang/test/CodeGen/init.c (-19)
  • (added) clang/test/CodeGen/linux-kernel-struct-union-initializer.c (+212)
  • (modified) clang/test/CodeGen/mingw-long-double.c (+3-6)
  • (modified) clang/test/CodeGen/mms-bitfields.c (+2-2)
  • (modified) clang/test/CodeGen/union-init2.c (+2-2)
  • (modified) clang/test/CodeGen/windows-swiftcall.c (+5-7)
  • (modified) clang/test/CodeGenObjC/designated-initializers.m (+1-1)
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index 90aa4c0745a8a..960add12f323b 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -334,6 +334,13 @@ llvm::Constant *CodeGenModule::getOrCreateStaticVarDecl(
   return Addr;
 }
 
+enum class IsPattern { No, Yes };
+
+static llvm::Constant *replaceUndef(CodeGenModule &CGM, IsPattern isPattern,
+                                    llvm::Constant *constant);
+static llvm::Constant *constWithPadding(CodeGenModule &CGM, IsPattern isPattern,
+                                        llvm::Constant *constant);
+
 /// AddInitializerToStaticVarDecl - Add the initializer for 'D' to the
 /// global variable that has already been created for it.  If the initializer
 /// has a different type than GV does, this may free GV and return a different
@@ -360,6 +367,13 @@ CodeGenFunction::AddInitializerToStaticVarDecl(const VarDecl &D,
     }
     return GV;
   }
+  if (getLangOpts().C99) {
+    // In C, static initialization guarantees that padding is initialized to
+    // zero bits. And the Linux kernel relies on clang to zero-initialize
+    // unspecified fields.
+    Init = constWithPadding(CGM, IsPattern::No,
+                            replaceUndef(CGM, IsPattern::No, Init));
+  }
 
 #ifndef NDEBUG
   CharUnits VarSize = CGM.getContext().getTypeSizeInChars(D.getType()) +
@@ -1037,8 +1051,6 @@ static bool shouldSplitConstantStore(CodeGenModule &CGM,
   return false;
 }
 
-enum class IsPattern { No, Yes };
-
 /// Generate a constant filled with either a pattern or zeroes.
 static llvm::Constant *patternOrZeroFor(CodeGenModule &CGM, IsPattern isPattern,
                                         llvm::Type *Ty) {
@@ -1048,9 +1060,6 @@ static llvm::Constant *patternOrZeroFor(CodeGenModule &CGM, IsPattern isPattern,
     return llvm::Constant::getNullValue(Ty);
 }
 
-static llvm::Constant *constWithPadding(CodeGenModule &CGM, IsPattern isPattern,
-                                        llvm::Constant *constant);
-
 /// Helper function for constWithPadding() to deal with padding in structures.
 static llvm::Constant *constStructWithPadding(CodeGenModule &CGM,
                                               IsPattern isPattern,
@@ -1108,6 +1117,9 @@ static llvm::Constant *constWithPadding(CodeGenModule &CGM, IsPattern isPattern,
     if (ZeroInitializer) {
       OpValue = llvm::Constant::getNullValue(ElemTy);
       PaddedOp = constWithPadding(CGM, isPattern, OpValue);
+      // Avoid iterating large arrays with zero initializer when possible.
+      if (PaddedOp->getType() == ElemTy)
+        return constant;
     }
     for (unsigned Op = 0; Op != Size; ++Op) {
       if (!ZeroInitializer) {
@@ -1953,21 +1965,26 @@ void CodeGenFunction::EmitAutoVarInit(const AutoVarEmission &emission) {
       D.mightBeUsableInConstantExpressions(getContext())) {
     assert(!capturedByInit && "constant init contains a capturing block?");
     constant = ConstantEmitter(*this).tryEmitAbstractForInitializer(D);
-    if (constant && !constant->isZeroValue() &&
-        (trivialAutoVarInit !=
-         LangOptions::TrivialAutoVarInitKind::Uninitialized)) {
-      IsPattern isPattern =
-          (trivialAutoVarInit == LangOptions::TrivialAutoVarInitKind::Pattern)
-              ? IsPattern::Yes
-              : IsPattern::No;
-      // C guarantees that brace-init with fewer initializers than members in
-      // the aggregate will initialize the rest of the aggregate as-if it were
-      // static initialization. In turn static initialization guarantees that
-      // padding is initialized to zero bits. We could instead pattern-init if D
-      // has any ImplicitValueInitExpr, but that seems to be unintuitive
-      // behavior.
-      constant = constWithPadding(CGM, IsPattern::No,
-                                  replaceUndef(CGM, isPattern, constant));
+    if (constant && !constant->isZeroValue()) {
+      if (getLangOpts().C99) {
+        // C guarantees that brace-init with fewer initializers than members in
+        // the aggregate will initialize the rest of the aggregate as-if it were
+        // static initialization. In turn static initialization guarantees that
+        // padding is initialized to zero bits. And the Linux kernel relies on
+        // clang to zero-initialize unspecified fields.
+        constant = constWithPadding(CGM, IsPattern::No,
+                                    replaceUndef(CGM, IsPattern::No, constant));
+      } else if (trivialAutoVarInit !=
+                 LangOptions::TrivialAutoVarInitKind::Uninitialized) {
+        IsPattern isPattern =
+            (trivialAutoVarInit == LangOptions::TrivialAutoVarInitKind::Pattern)
+                ? IsPattern::Yes
+                : IsPattern::No;
+        // We could instead pattern-init padding if D has any
+        // ImplicitValueInitExpr, but that seems to be unintuitive behavior.
+        constant = constWithPadding(CGM, IsPattern::No,
+                                    replaceUndef(CGM, isPattern, constant));
+      }
     }
   }
 
@@ -2849,3 +2866,9 @@ CodeGenModule::getOMPAllocateAlignment(const VarDecl *VD) {
   }
   return std::nullopt;
 }
+
+llvm::Constant *
+CodeGenModule::zeroInitGlobalVarInitializer(llvm::Constant *Init) {
+  return constWithPadding(*this, IsPattern::No,
+                          replaceUndef(*this, IsPattern::No, Init));
+}
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 652f519d82488..a29eafa2fad68 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -5433,6 +5433,12 @@ void CodeGenModule::EmitGlobalVarDefinition(const VarDecl *D,
         Init = llvm::UndefValue::get(getTypes().ConvertType(T));
       }
     } else {
+      if (getLangOpts().C99) {
+        // In C, static initialization guarantees that padding is initialized
+        // to zero bits. And the Linux kernel relies on clang to
+        // zero-initialize unspecified fields.
+        Initializer = zeroInitGlobalVarInitializer(Initializer);
+      }
       Init = Initializer;
       // We don't need an initializer, so remove the entry for the delayed
       // initializer position (just in case this entry was delayed) if we
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 22b2b314c316c..3222de2a32e6d 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1837,6 +1837,8 @@ class CodeGenModule : public CodeGenTypeCache {
 
   llvm::Metadata *CreateMetadataIdentifierImpl(QualType T, MetadataTypeMap &Map,
                                                StringRef Suffix);
+
+  llvm::Constant *zeroInitGlobalVarInitializer(llvm::Constant *Init);
 };
 
 }  // end namespace CodeGen
diff --git a/clang/test/CodeGen/2008-07-22-bitfield-init-after-zero-len-array.c b/clang/test/CodeGen/2008-07-22-bitfield-init-after-zero-len-array.c
index b72d689659e60..b639734ef5d4b 100644
--- a/clang/test/CodeGen/2008-07-22-bitfield-init-after-zero-len-array.c
+++ b/clang/test/CodeGen/2008-07-22-bitfield-init-after-zero-len-array.c
@@ -8,4 +8,4 @@ struct et7 {
   52, 
 };
 
-// CHECK: @yv7 ={{.*}} global %struct.et7 { [0 x float] zeroinitializer, i8 52 }
+// CHECK: @yv7 ={{.*}} global { [0 x float], i8, [3 x i8] } { [0 x float] zeroinitializer, i8 52, [3 x i8] zeroinitializer }
diff --git a/clang/test/CodeGen/2008-08-07-AlignPadding1.c b/clang/test/CodeGen/2008-08-07-AlignPadding1.c
index 17e88ce02659f..d69cbc22cc1df 100644
--- a/clang/test/CodeGen/2008-08-07-AlignPadding1.c
+++ b/clang/test/CodeGen/2008-08-07-AlignPadding1.c
@@ -20,9 +20,9 @@ struct gc_generation {
 
 #define GEN_HEAD(n) (&generations[n].head)
 
-// The idea is that there are 6 undefs in this structure initializer to cover
+// The idea is that there are 6 zeroinitializers in this structure initializer to cover
 // the padding between elements.
-// CHECK: @generations ={{.*}} global [3 x %struct.gc_generation] [%struct.gc_generation { %union._gc_head { %struct.anon { ptr @generations, ptr @generations, i64 0 }, [8 x i8] undef }, i32 700, i32 0, [8 x i8] undef }, %struct.gc_generation { %union._gc_head { %struct.anon { ptr getelementptr (i8, ptr @generations, i64 48), ptr getelementptr (i8, ptr @generations, i64 48), i64 0 }, [8 x i8] undef }, i32 10, i32 0, [8 x i8] undef }, %struct.gc_generation { %union._gc_head { %struct.anon { ptr getelementptr (i8, ptr @generations, i64 96), ptr getelementptr (i8, ptr @generations, i64 96), i64 0 }, [8 x i8] undef }, i32 10, i32 0, [8 x i8] undef }]
+// CHECK: @generations ={{.*}} global [3 x %struct.gc_generation] [%struct.gc_generation { %union._gc_head { %struct.anon { ptr @generations, ptr @generations, i64 0 }, [8 x i8] zeroinitializer }, i32 700, i32 0, [8 x i8] zeroinitializer }, %struct.gc_generation { %union._gc_head { %struct.anon { ptr getelementptr (i8, ptr @generations, i64 48), ptr getelementptr (i8, ptr @generations, i64 48), i64 0 }, [8 x i8] zeroinitializer }, i32 10, i32 0, [8 x i8] zeroinitializer }, %struct.gc_generation { %union._gc_head { %struct.anon { ptr getelementptr (i8, ptr @generations, i64 96), ptr getelementptr (i8, ptr @generations, i64 96), i64 0 }, [8 x i8] zeroinitializer }, i32 10, i32 0, [8 x i8] zeroinitializer }]
 /* linked lists of container objects */
 struct gc_generation generations[3] = {
         /* PyGC_Head,                           threshold,      count */
diff --git a/clang/test/CodeGen/64bit-swiftcall.c b/clang/test/CodeGen/64bit-swiftcall.c
index b1c42e3b0a657..4b0143e37afbf 100644
--- a/clang/test/CodeGen/64bit-swiftcall.c
+++ b/clang/test/CodeGen/64bit-swiftcall.c
@@ -14,8 +14,6 @@
 
 // CHECK-DAG: %struct.atomic_padded = type { { %struct.packed, [7 x i8] } }
 // CHECK-DAG: %struct.packed = type <{ i64, i8 }>
-//
-// CHECK: [[STRUCT2_RESULT:@.*]] = private {{.*}} constant [[STRUCT2_TYPE:%.*]] { i32 0, i8 0, i8 undef, i8 0, i32 0, i32 0 }
 
 /*****************************************************************************/
 /****************************** PARAMETER ABIS *******************************/
@@ -162,8 +160,8 @@ typedef struct {
 } struct_2;
 TEST(struct_2);
 // CHECK-LABEL: define{{.*}} swiftcc { i64, i64 } @return_struct_2() {{.*}}{
-// CHECK:   [[RET:%.*]] = alloca [[STRUCT2_TYPE]], align 4
-// CHECK:   call void @llvm.memcpy{{.*}}({{.*}}[[RET]], {{.*}}[[STRUCT2_RESULT]]
+// CHECK:   [[RET:%.*]] = alloca [[STRUCT2:%.*]], align 4
+// CHECK:   call void @llvm.memset
 // CHECK:   [[GEP0:%.*]] = getelementptr inbounds { i64, i64 }, ptr [[RET]], i32 0, i32 0
 // CHECK:   [[T0:%.*]] = load i64, ptr [[GEP0]], align 4
 // CHECK:   [[GEP1:%.*]] = getelementptr inbounds { i64, i64 }, ptr [[RET]], i32 0, i32 1
@@ -173,7 +171,7 @@ TEST(struct_2);
 // CHECK:   ret { i64, i64 } [[R1]]
 // CHECK: }
 // CHECK-LABEL: define{{.*}} swiftcc void @take_struct_2(i64 %0, i64 %1) {{.*}}{
-// CHECK:   [[V:%.*]] = alloca [[STRUCT:%.*]], align 4
+// CHECK:   [[V:%.*]] = alloca [[STRUCT2]], align 4
 // CHECK:   [[GEP0:%.*]] = getelementptr inbounds { i64, i64 }, ptr [[V]], i32 0, i32 0
 // CHECK:   store i64 %0, ptr [[GEP0]], align 4
 // CHECK:   [[GEP1:%.*]] = getelementptr inbounds { i64, i64 }, ptr [[V]], i32 0, i32 1
@@ -181,7 +179,7 @@ TEST(struct_2);
 // CHECK:   ret void
 // CHECK: }
 // CHECK-LABEL: define{{.*}} void @test_struct_2() {{.*}} {
-// CHECK:   [[TMP:%.*]] = alloca [[STRUCT2_TYPE]], align 4
+// CHECK:   [[TMP:%.*]] = alloca [[STRUCT2]], align 4
 // CHECK:   [[CALL:%.*]] = call swiftcc { i64, i64 } @return_struct_2()
 // CHECK:   [[GEP:%.*]] = getelementptr inbounds {{.*}} [[TMP]], i32 0, i32 0
 // CHECK:   [[T0:%.*]] = extractvalue { i64, i64 } [[CALL]], 0
@@ -254,7 +252,7 @@ typedef union {
 TEST(union_het_fp)
 // CHECK-LABEL: define{{.*}} swiftcc i64 @return_union_het_fp()
 // CHECK:  [[RET:%.*]] = alloca [[UNION:%.*]], align 8
-// CHECK:  call void @llvm.memcpy{{.*}}(ptr align 8 [[RET]]
+// CHECK:  call void @llvm.memset{{.*}}(ptr align 8 [[RET]]
 // CHECK:  [[GEP:%.*]] = getelementptr inbounds { i64 }, ptr [[RET]], i32 0, i32 0
 // CHECK:  [[R0:%.*]] = load i64, ptr [[GEP]], align 8
 // CHECK:  ret i64 [[R0]]
diff --git a/clang/test/CodeGen/arm-swiftcall.c b/clang/test/CodeGen/arm-swiftcall.c
index 9fa607a968ccf..78ceed5926014 100644
--- a/clang/test/CodeGen/arm-swiftcall.c
+++ b/clang/test/CodeGen/arm-swiftcall.c
@@ -172,7 +172,7 @@ typedef struct {
 TEST(struct_2);
 // CHECK-LABEL: define{{.*}} @return_struct_2()
 // CHECK:   [[RET:%.*]] = alloca [[REC:%.*]], align 4
-// CHECK:   @llvm.memcpy
+// CHECK:   @llvm.memset
 // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG:{ i32, i32, float, float }]], ptr [[RET]], i32 0, i32 0
 // CHECK:   [[FIRST:%.*]] = load i32, ptr [[T0]], align 4
 // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG]], ptr [[RET]], i32 0, i32 1
@@ -274,7 +274,7 @@ typedef union {
 TEST(union_het_fp)
 // CHECK-LABEL: define{{.*}} @return_union_het_fp()
 // CHECK:   [[RET:%.*]] = alloca [[REC:%.*]], align {{(4|8)}}
-// CHECK:   @llvm.memcpy
+// CHECK:   @llvm.memset
 // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG:{ i32, i32 }]], ptr [[RET]], i32 0, i32 0
 // CHECK:   [[FIRST:%.*]] = load i32, ptr [[T0]], align {{(4|8)}}
 // CHECK:   [[T0:%.*]] = getelementptr inbounds [[AGG]], ptr [[RET]], i32 0, i32 1
diff --git a/clang/test/CodeGen/const-init.c b/clang/test/CodeGen/const-init.c
index ad3e9551199ac..fc973cb983a80 100644
--- a/clang/test/CodeGen/const-init.c
+++ b/clang/test/CodeGen/const-init.c
@@ -170,7 +170,7 @@ void g30(void) {
     int : 1;
     int x;
   } a = {};
-  // CHECK: @g30.a = internal global %struct.anon.1 <{ i8 undef, i32 0 }>, align 1
+  // CHECK: @g30.a = internal global %struct.anon.1 zeroinitializer, align 1
 #pragma pack()
 }
 
@@ -182,7 +182,7 @@ void g31(void) {
     short z;
   } a = {23122, -12312731, -312};
 #pragma pack()
-  // CHECK: @g31.a = internal global %struct.anon.2 { i16 23122, i32 -12312731, i16 -312 }, align 4
+  // CHECK: @g31.a = internal global { i16, [2 x i8], i32, i16, [2 x i8] } { i16 23122, [2 x i8] zeroinitializer, i32 -12312731, i16 -312, [2 x i8] zeroinitializer }, align 4
 }
 
 // Clang should evaluate this in constant context, so floating point mode should
diff --git a/clang/test/CodeGen/designated-initializers.c b/clang/test/CodeGen/designated-initializers.c
index 620b1b90d2575..56600f39e8bb2 100644
--- a/clang/test/CodeGen/designated-initializers.c
+++ b/clang/test/CodeGen/designated-initializers.c
@@ -8,7 +8,7 @@ struct foo {
 // CHECK: @u ={{.*}} global %union.anon zeroinitializer
 union { int i; float f; } u = { };
 
-// CHECK: @u2 ={{.*}} global { i32, [4 x i8] } { i32 0, [4 x i8] undef }
+// CHECK: @u2 ={{.*}} global { i32, [4 x i8] } zeroinitializer
 union { int i; double f; } u2 = { };
 
 // CHECK: @u3 ={{.*}} global  %union.anon.1 zeroinitializer
@@ -62,22 +62,22 @@ struct overwrite_string_struct2 {
   char L[6];
   int M;
 } overwrite_string2[] = { { { "foo" }, 1 }, [0].L[2] = 'x'};
-// CHECK: [6 x i8] c"fox\00\00\00", i32 1
+// CHECK: [6 x i8] c"fox\00\00\00", [2 x i8] zeroinitializer, i32 1
 struct overwrite_string_struct3 {
   char L[3];
   int M;
 } overwrite_string3[] = { { { "foo" }, 1 }, [0].L[2] = 'x'};
-// CHECK: [3 x i8] c"fox", i32 1
+// CHECK: [3 x i8] c"fox", [1 x i8] zeroinitializer, i32 1
 struct overwrite_string_struct4 {
   char L[3];
   int M;
 } overwrite_string4[] = { { { "foobar" }, 1 }, [0].L[2] = 'x'};
-// CHECK: [3 x i8] c"fox", i32 1
+// CHECK: [3 x i8] c"fox", [1 x i8] zeroinitializer, i32 1
 struct overwrite_string_struct5 {
   char L[6];
   int M;
 } overwrite_string5[] = { { { "foo" }, 1 }, [0].L[4] = 'y'};
-// CHECK: [6 x i8] c"foo\00y\00", i32 1
+// CHECK: [6 x i8] c"foo\00y\00", [2 x i8] zeroinitializer, i32 1
 
 
 // CHECK: @u1 = {{.*}} { i32 65535 }
@@ -138,7 +138,7 @@ union_16644_t union_16644_instance_4[2] =
   [1].b[1] = 4
 };
 
-// CHECK: @lab ={{.*}} global { [4 x i8], i32 } { [4 x i8] undef, i32 123 }
+// CHECK: @lab ={{.*}} global { [4 x i8], i32 } { [4 x i8] zeroinitializer, i32 123 }
 struct leading_anon_bitfield { int : 32; int n; } lab = { .n = 123 };
 
 struct Base {
diff --git a/clang/test/CodeGen/flexible-array-init.c b/clang/test/CodeGen/flexible-array-init.c
index 15a30c15ac966..08b5aaf63eaea 100644
--- a/clang/test/CodeGen/flexible-array-init.c
+++ b/clang/test/CodeGen/flexible-array-init.c
@@ -14,11 +14,11 @@ struct { int y[]; } b1 = { { 14, 16 } };
 
 // sizeof(c) == 8, so this global should be at least 8 bytes.
 struct { int x; char c; char y[]; } c = { 1, 2, { 13, 15 } };
-// CHECK: @c ={{.*}} global { i32, i8, [2 x i8] } { i32 1, i8 2, [2 x i8] c"\0D\0F" }
+// CHECK: @c ={{.*}} global { i32, i8, [2 x i8], [1 x i8] } { i32 1, i8 2, [2 x i8] c"\0D\0F", [1 x i8] zeroinitializer }
 
 // sizeof(d) == 8, so this global should be at least 8 bytes.
 struct __attribute((packed, aligned(4))) { char a; int x; char z[]; } d = { 1, 2, { 13, 15 } };
-// CHECK: @d ={{.*}} <{ i8, i32, [2 x i8], i8 }> <{ i8 1, i32 2, [2 x i8] c"\0D\0F", i8 undef }>,
+// CHECK: @d ={{.*}} <{ i8, i32, [2 x i8], i8 }> <{ i8 1, i32 2, [2 x i8] c"\0D\0F", i8 0 }>,
 
 // This global needs 9 bytes to hold all the flexible array members.
 struct __attribute((packed, aligned(4))) { char a; int x; char z[]; } e = { 1, 2, { 13, 15, 17, 19 } };
@@ -55,21 +55,21 @@ struct { int a; union { int b; short x[]; }; int c; int d; } hf = {1, 2, {}, 3};
 
 // First member is the potential flexible array, initialization requires braces.
 struct { int a; union { short x; int b; }; int c; int d; } i = {1, 2, {}, 3};
-// CHECK: @i = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] undef }, i32 0, i32 3 }
+// CHECK: @i = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] zeroinitializer }, i32 0, i32 3 }
 struct { int a; union { short x[0]; int b; }; int c; int d; } i0 = {1, {}, 2, 3};
-// CHECK: @i0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 2, i32 3 }
+// CHECK: @i0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } zeroinitializer, i32 2, i32 3 }
 struct { int a; union { short x[1]; int b; }; int c; int d; } i1 = {1, {2}, {}, 3};
-// CHECK: @i1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] undef }, i32 0, i32 3 }
+// CHECK: @i1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] zeroinitializer }, i32 0, i32 3 }
 struct { int a; union { short x[]; int b; }; int c; int d; } i_f = {4, {}, {}, 6};
-// CHECK: @i_f = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 4, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 0, i32 6 }
+// CHECK: @i_f = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 4, { [0 x i16], [4 x i8] } zeroinitializer, i32 0, i32 6 }
 
 // Named initializers; order doesn't matter.
 struct { int a; union { int b; short x; }; int c; int d; } hn = {.a = 1, .x = 2, .c = 3};
-// CHECK: @hn = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] undef }, i32 3, i32 0 }
+// CHECK: @hn = global { i32, { i16, [2 x i8] }, i32, i32 } { i32 1, { i16, [2 x i8] } { i16 2, [2 x i8] zeroinitializer }, i32 3, i32 0 }
 struct { int a; union { int b; short x[0]; }; int c; int d; } hn0 = {.a = 1, .x = {2}, .c = 3};
-// CHECK: @hn0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } { [0 x i16] zeroinitializer, [4 x i8] undef }, i32 3, i32 0 }
+// CHECK: @hn0 = global { i32, { [0 x i16], [4 x i8] }, i32, i32 } { i32 1, { [0 x i16], [4 x i8] } zeroinitializer, i32 3, i32 0 }
 struct { int a; union { int b; short x[1]; }; int c; int d; } hn1 = {.a = 1, .x = {2}, .c = 3};
-// CHECK: @hn1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] undef }, i32 3, i32 0 }
+// CHECK: @hn1 = global { i32, { [1 x i16], [2 x i8] }, i32, i32 } { i32 1, { [1 x i16], [2 x i8] } { [1 x i16] [i16 2], [2 x i8] zeroinitializer }, i32 3, i32 0 }
 
 struct { char a[]; } empty_struct = {};
 // CHECK: @empty_struct ={{.*}} global %struct.anon{{.*}} zeroinitializer, align 1
@@ -96,10 +96,10 @@ union { char a[]; } only_in_union0 = {0};
 // CHECK: @only_in_union0 = global { [1 x i8] } zeroi...
[truncated]

@yabinc yabinc linked an issue Jun 28, 2024 that may be closed by this pull request
@efriedma-quic
Copy link
Collaborator

The only place the standard requires padding to be zero-initialized is in "default initialization", which only applies to members which don't have an initializer. So, for example, if you have struct X { int x; char y; } z = {1};, none of the padding is required to be initialized; the only default initialization is of "y", which doesn't have any padding. Similarly, if you have union X { char x; int y; } z = {1};, none of the padding is required to be initialized; there's no default initialization at all. On the other hand, if you write union X { char x; int y; } z = {};, that padding is required to be initialized.

There's probably some argument for doing what this patch does, in addition to what the standard requires. But to avoid any confusion, please file a separate bug for that, and make sure the comments in the patch distinguish between what the standard requires and what we actually implement.

@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from bf3f611 to d3aa71b Compare July 2, 2024 20:44
@yabinc yabinc changed the title [clang][CodeGen] Zero init unspecified fields in initializers in C (#78034) [clang][CodeGen] Zero init unspecified fields in initializers in C (#97459) Jul 2, 2024
@yabinc yabinc removed a link to an issue Jul 2, 2024
@yabinc
Copy link
Contributor Author

yabinc commented Jul 2, 2024

@efriedma-quic, Thanks for the suggestion!
I have created a new bug in #97459, and also updated commit message to clarify that this is needed by the linux kernel instead of c23 standard.

@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from d3aa71b to fe87409 Compare July 4, 2024 06:08
@yabinc
Copy link
Contributor Author

yabinc commented Jul 9, 2024

@efriedma-quic, @rjmccall , @AaronBallman , @nickdesaulniers , ping for review or comments?

@AaronBallman AaronBallman added the c label Jul 10, 2024
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! I'll leave the codegen code review to John and Eli, but I would appreciate explicit documentation in clang/docs/LanguageExtensions.rst so that we explicitly document this behavior and users (including the Linux kernel) can rely on it longterm.

@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from fe87409 to 4166d86 Compare July 22, 2024 23:52
@AaronBallman
Copy link
Collaborator

Ping @rjmccall @efriedma-quic -- I'd like to see this get into the release so that the Linux kernel can be built with Clang 19.

@pirama-arumuga-nainar
Copy link
Collaborator

Ping @rjmccall @efriedma-quic. Any comments on this patch?

@yabinc
Copy link
Contributor Author

yabinc commented Jul 30, 2024

Test case showing difference in behavior w.r.t. gcc (without this patch) https://godbolt.org/z/9fPe78cEz

@yabinc yabinc changed the title [clang][CodeGen] Zero init unspecified fields in initializers in C (#97459) [clang][CodeGen] The linux kernel expects initializers to FULLY initialize variables, structs and unions including padding Jul 30, 2024
@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from 4166d86 to 6fe369d Compare July 30, 2024 23:30
Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really all that's required? It looks like you're just filling in explicit zero padding when emitting constant initializers. That should steer clear of any possibility that LLVM would treat the padding as undef for optimization purposes (surely undef bytes are still emitted as zero when rendering the initializer in assembly? it doesn't hurt to be more explicit, of course), so okay, it's good to do. But I assume that similar code is required in order to explicitly zero out any padding when dynamically initializing a local variable.

Also, is adding constant padding fields retroactively really the best way to handle this instead of just expecting the constant emitter to fill them in itself?

if (!getLangOpts().CPlusPlus) {
// In C, when an initializer is given, the Linux kernel relies on clang to
// zero-initialize all members not explicitly initialized, including padding
// bits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments in the compiler generally shouldn't refer to specific projects. We are making a guarantee based on our interpretation of the standard.

Similarly, the PR title is also inappropriate. It is fine to note in the PR description that Linux relies on this for correctness, but the PR title should describe the semantic change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think we should do this unconditionally in all language modes. It's not like this is forbidden in C++.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I assume that similar code is required in order to explicitly zero out any padding when dynamically initializing a local variable.

I guess you mean variables assigned by Compound literals after variable definition. An example is in https://godbolt.org/z/caaEKnjWe. From the ast dump, it is treated as an assignment from InitListExpr. The source code seems to be in AggExprEmitter::VisitCXXParenListOrInitListExpr() in https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGExprAgg.cpp#L1757. Each field of the InitListExpr is stored to the variable separately. One solution is to always do memset in CheckAggExprForMemSetUse() in https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGExprAgg.cpp#L1974. The memset can be optimized away later if not needed.
Another place is compound literals for global values. It seems handled in CGM.GetAddrOfConstantCompoundLiteral().
To not introduce a big change at once, Shall I handle them in a follow up patch?

Also, is adding constant padding fields retroactively really the best way to handle this instead of just expecting the constant emitter to fill them in itself?

Currently I limit the change to C and reuse code for trivial auto var zero init. So it's easier to do it outside the constant emitter. If the change can be extended to C++, I feel it's better to always do it in the constant emitter.

Comments in the compiler generally shouldn't refer to specific projects. We are making a guarantee based on our interpretation of the standard.

The standard is ambiguous on this, so can't be used as a reason for the change. Please let me know if you have better idea about how to reword the comment.

Similarly, the PR title is also inappropriate. It is fine to note in the PR description that Linux relies on this for correctness, but the PR title should describe the semantic change.

Changed the PR title back to the commit title, which describes the semantic change.

Also, I think we should do this unconditionally in all language modes. It's not like this is forbidden in C++.

I agree. But when I was trying to do it before, I need to fix a lot more tests. And the Linux kernel only has C code. I don't have strong reason to push this to C++. So I want to limit the change to C at first. If reviewers think we should also do it in C++, shall I do it in a follow up change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I assume that similar code is required in order to explicitly zero out any padding when dynamically initializing a local variable.

I guess you mean variables assigned by Compound literals after variable definition. An example is in https://godbolt.org/z/caaEKnjWe.

No, I mean initializing a local aggregate with a non-constant initializer, e.g.

void g(char c) {
  struct A a = { c, 2 };
}

Do we already zero the padding correctly in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we already zero the padding correctly in this case?

No, the non-constant initializer goes through the same implemented path as "variables assigned by Compound literals after variable definition". I guess I'd better do it in the same patch. Do you have any suggestions on how to implement it? Otherwise I'll try always do memset in CheckAggExprForMemSetUse() in https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CGExprAgg.cpp#L1974.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard is ambiguous on this, so can't be used as a reason for the change. Please let me know if you have better idea about how to reword the comment.

I suggest quoting the standard, identifying the ambiguity, and then saying that we've chosen to interpret it as requiring all padding in the object to be zeroed. Since there are multiple paths in the compiler that need this logic, you may want to write this in a single place and then cross-reference it from other places.

I agree. But when I was trying to do it before, I need to fix a lot more tests. And the Linux kernel only has C code. I don't have strong reason to push this to C++. So I want to limit the change to C at first. If reviewers think we should also do it in C++, shall I do it in a follow up change?

I don't mind staging it in over multiple commits in principle. If you'd like to switch to an implementation that adds explicit padding fields when building the original constant, you could still stage that and only add those fields in certain language modes.

No, the non-constant initializer goes through the same implemented path as "variables assigned by Compound literals after variable definition". I guess I'd better do it in the same patch. Do you have any suggestions on how to implement it?

We probably do need to consider it when deciding whether to memset, but naively I'd expect that you'd just emit little memsets for each individual piece of padding when we recognize that there's a gap between successive fields. Does that not work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest quoting the standard, identifying the ambiguity, and then saying that we've chosen to interpret it as requiring all padding in the object to be zeroed. Since there are multiple paths in the compiler that need this logic, you may want to write this in a single place and then cross-reference it from other places.

Done.

I don't mind staging it in over multiple commits in principle. If you'd like to switch to an implementation that adds explicit padding fields when building the original constant, you could still stage that and only add those fields in certain language modes.

I am OOO next week, will try it in ConstantEmitter after that.

We probably do need to consider it when deciding whether to memset, but naively I'd expect that you'd just emit little memsets for each individual piece of padding when we recognize that there's a gap between successive fields. Does that not work?

It works. Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind staging it in over multiple commits in principle. If you'd like to switch to an implementation that adds explicit padding fields when building the original constant, you could still stage that and only add those fields in certain language modes.

Changed as suggested, adding explicit padding fields when building the original constant.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding handling for C++ is tricky: C++ allows fields and base classes to overlap. For static initialization, it's straightforward to check for overlap, but for dynamic initialization it's problematic. In some cases (AggValueSlot::DoesNotOverlap), we can prove there is no overlap, but in general it's ambiguous. In such cases, we have to leave padding uninitialized. So I think sticking to C for the first iteration makes sense; we'll have to carefully consider exactly what guarantees we can provide for C++.

@yabinc yabinc changed the title [clang][CodeGen] The linux kernel expects initializers to FULLY initialize variables, structs and unions including padding [clang][CodeGen] Zero init unspecified fields in initializers in C Jul 31, 2024
@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from 6fe369d to e1d7577 Compare August 3, 2024 01:15
Copy link

github-actions bot commented Aug 3, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably makes sense to add a helper like the following to CodeGenModule or something like that:

bool shouldZeroInitPadding() { 
  // Comment explaining reasoning for this rule
  return !CGM.getLangOpts().CPlusPlus;
}

So we don't have // See comment in getPadding(). all over the place.


Is there any relevant code using EmitNullConstant()? As far as I can tell, this patch doesn't zero-fill implicit padding in null constants (like in struct { int x; long y; }.

@yabinc
Copy link
Contributor Author

yabinc commented Aug 30, 2024

It probably makes sense to add a helper like the following to CodeGenModule or something like that:

bool shouldZeroInitPadding() { 
  // Comment explaining reasoning for this rule
  return !CGM.getLangOpts().CPlusPlus;
}

So we don't have // See comment in getPadding(). all over the place.

Done.

Is there any relevant code using EmitNullConstant()? As far as I can tell, this patch doesn't zero-fill implicit padding in null constants (like in struct { int x; long y; }.

I ran all tests in clang/tests and found two places using it:

  1. One is used in CodeGenFunction::EmitNullInitialization(Address DestPtr, QualType Ty), when !CGM.getTypes().isZeroInitializable(Ty). This seems to only happen in C++ code (when a class contains a pointer pointing
    to data member).
  2. Another is used in CodeGenModule::getOrCreateStaticVarDecl(), which is further called in three places:
    a. CodeGenFunction::AddInitializerToStaticVarDecl(): it later calls ConstantEmitter::tryEmitForInitializer() if an initializer is used.
    b. CodeGenFunction::EmitDeclRefLValue(const DeclRefExpr *E) in CGExpr.cpp
    c. ConstantLValueEmitter::tryEmitBase(const APValue::LValueBase &base) in CGExprConstant.cpp
    Not sure what b and c are doing.
    When running tests for non c++ code, I only found the 2.a path, used in CodeGenOpenCL/amdgpu-nullptr.cl.
    So I didn't find any C code using EmitNullConstant(), that results in uninitialized fields with initializers.

For the implementation of EmitNullConstant(), it returns ConstantStruct::get(llvm::ConstantStruct::get(structure, elements). In ConstantStruct::get(), if all elements are zero, then it returns ConstantAggregateZero::get(ST).
From https://llvm.org/docs/LangRef.html:
The string ‘zeroinitializer’ can be used to zero initialize a value to zero of any type, including scalar and [aggregate](https://llvm.org/docs/LangRef.html#t-aggregate) types. This is often used to avoid having to print large zero initializers (e.g. for large arrays) and is always exactly equivalent to using explicit zero initializers.

From the above words, I feel using %struct.S2 zeroinitializer may not guarantee the padding in the struct are zero initialized. But if it is used for a static variable created by CodeGenModule::getOrCreateStaticVarDecl(), it's
as good as a static/global variable without initializer.

@yabinc
Copy link
Contributor Author

yabinc commented Sep 5, 2024

@rjmccall and @efriedma-quic, ping for review?

@pirama-arumuga-nainar
Copy link
Collaborator

@rjmccall and @efriedma-quic, do you have any other comments on this PR?

if (!DoZeroInitPadding(Layout, FieldNo, *Field, SizeSoFar,
IsFlexibleArray, AllowOverwrite))
return false;
if (IsFlexibleArray)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've figured out the way IsFlexibleArray works, but it's sort of awkward. Do you really need to use a different codepath for flexible-array fields, vs. ordinary fields? It seems like it would be simpler to just use getTypeAllocSize() in both cases. And the name "IsFlexibleArray" is a bit confusing because isn't really reliably detecting flexible arrays; there are other zero-size fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed IsFlexibleArray to FieldSize.
If you mean to use getTypeAllocSize() in DoZeroInitPadding(),
I don't know how to replace ASTContext::getTypeSizeInChars(clang::QualType T) with DataLayout::getTypeAllocSize(llvm::Type *Ty).

If you mean to always use CGM.getDataLayout().getTypeAllocSize(EltInit->getType()), there is a case when EltInit isn't avaiable (when emitting a DesignatedInitUpdateExpr with a nested InitListExpr).

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I think we'll need to watch out for performance regressions... in most cases, we can probably zero-pad cheaply, but there might be cases where it isn't cheap, or cases where it should be cheap but the LLVM backend can't figure it out. There's a limit to how much we can do before this lands... but maybe helpful if you can show Linux kernel codesize as a reference, to show there aren't massive issues.

@yabinc yabinc closed this Sep 24, 2024
@yabinc yabinc force-pushed the zero-init-unspecified-fields-in-initializers-in-c branch from 585e8ef to 0907552 Compare September 24, 2024 21:33
hen an initializer is provided to a variable, the Linux kernel relied
on the compiler to zero-initialize unspecified fields, as clarified in
https://www.spinics.net/lists/netdev/msg1007244.html.

But clang doesn't guarantee this:
1. For a union type, if an empty initializer is given, clang only
   initializes bytes for the first field, left bytes for other (larger)
   fields are marked as undef. Accessing those undef bytes can lead
   to undefined behaviors.
2. For a union type, if an initializer explicitly sets a field, left
   bytes for other (larger) fields are marked as undef.
3. When an initializer is given, clang doesn't zero initialize padding.

So this patch makes the following change:
1. In C, when an initializer is provided for a variable, zero-initialize
   undef and padding fields in the initializer.
2. Document the change in LanguageExtensions.rst.

As suggested in llvm#78034 (comment),
the change isn't required by C23, but it's standards conforming to do
so.
@yabinc yabinc reopened this Sep 24, 2024
@yabinc
Copy link
Contributor Author

yabinc commented Sep 24, 2024

but maybe helpful if you can show Linux kernel codesize as a reference, to show there aren't massive issues.

I tested it on android-mainline, which closely follows linux upstream. The size change is very small.
.text 0x1139000 -> 0x1138000
.init.text 0x061f08 -> 0x062050
.data 0x20bec8 -> 0x20bf08

The linux kernel enables -ftrivial-auto-var-init=zero by default, which I guess already covers most (if not all) initialization instructions.

@yabinc
Copy link
Contributor Author

yabinc commented Sep 24, 2024

I manually squashed all the commits. Because there is a merge conflict when I was trying git merge --squash locally.

@yabinc yabinc merged commit 7a086e1 into llvm:main Sep 25, 2024
12 of 13 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 25, 2024

LLVM Buildbot has detected a new failure on builder clang-cmake-x86_64-avx512-linux running on avx512-intel64 while building clang at step 13 "test-suite".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/133/builds/4196

Here is the relevant piece of the build log for the reference
Step 13 (test-suite) failure: test (failure)
...
cd /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/tramp3d-v4 && /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1.install/bin/llvm-size --format=sysv /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4 > /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.size
cd /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/tramp3d-v4 && /usr/bin/cmake -E create_symlink /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/test-suite/MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.reference_output /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.reference_output
gmake[2]: Leaving directory '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build'
[ 36%] Built target tramp3d-v4
gmake[1]: Leaving directory '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build'
gmake: *** [Makefile:136: all] Error 2
2024-09-24 19:12:47 INFO: Testing...
2024-09-24 19:12:47 INFO: Execute: /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/stage1/bin/llvm-lit -v -j 32 /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build -o /localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/outputc9h6rhng.json
-- Testing: 2514 tests, 32 workers --
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_blendvps_253.test (1 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_blendvps_253.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_blendvps_253' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_blendvps_299.test (2 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_blendvps_299.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_blendvps_299' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_205.test (3 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_205.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_addpd_205' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_190.test (4 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_190.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_addpd_190' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_197.test (5 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_197.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_197' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_addps_117.test (6 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_addps_117.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_addps_117' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_212.test (7 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_212.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_212' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_addps_165.test (8 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_addps_165.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_addps_165' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_227.test (9 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_227.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_cmpeqpd_227' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_220.test (10 of 2514)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_addpd_220.test' FAILED ********************
Executable '/localdisk2/buildbot/llvm-worker/clang-cmake-x86_64-avx512-linux/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_addpd_220' is missing
********************

@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 25, 2024

LLVM Buildbot has detected a new failure on builder clang-armv7-lnt running on linaro-clang-armv7-lnt while building clang at step 12 "test-suite".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/77/builds/3721

Here is the relevant piece of the build log for the reference
Step 12 (test-suite) failure: test (failure)
...
cd /home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/SingleSource/UnitTests/Vector && /usr/local/bin/cmake -E create_symlink /home/tcwg-buildbot/worker/clang-armv7-lnt/test/test-suite/SingleSource/UnitTests/Vector/constpool.reference_output /home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/SingleSource/UnitTests/Vector/constpool.reference_output
gmake[2]: Leaving directory '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build'
[ 23%] Built target Vector-constpool
gmake[1]: Leaving directory '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build'
gmake: *** [Makefile:136: all] Error 2

2024-09-25 02:32:56 INFO: Testing...
2024-09-25 02:32:56 INFO: Execute: /home/tcwg-buildbot/worker/clang-armv7-lnt/stage1/bin/llvm-lit -v -j 32 /home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build -o /home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/outputtny53rq4.json
-- Testing: 3423 tests, 32 workers --
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_3.test (1 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_3.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s16_3' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_331.test (2 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_331.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s16_331' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_662.test (3 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_662.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s16_662' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_991.test (4 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s16_991.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s16_991' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_333.test (5 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_333.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s32_333' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_5.test (6 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_5.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s32_5' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_664.test (7 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_664.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s32_664' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_993.test (8 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s32_993.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s32_993' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s8_1.test (9 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s8_1.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s8_1' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s8_329.test (10 of 3423)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_vaba_s8_329.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-armv7-lnt/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_vaba_s8_329' is missing
********************

efriedma-quic added a commit that referenced this pull request Sep 25, 2024
… in C" (#109898)

Reverts #97121

Causing failures on LNT bots; log shows a crash in
ConstStructBuilder::BuildStruct.
@hubert-reinterpretcast
Copy link
Collaborator

@AaronBallman, considering the concern of potential performance regressions ("unnecessary" initialization of automatic-storage duration objects), should it be the case that there is no option control?

It was also the case that the prior Clang behaviour (with -ftrivial-auto-var-init=pattern [1]: https://godbolt.org/z/x994WMT6h) had some utility for identifying potential portability issues with a program. Option control would make me much happier about this change.

[1]:
A string of 0xaa is observed using -ftrivial-auto-var-init=pattern using Clang 19.1:

typedef union { int x; long long y; } U;

int printf(const char *, ...);
int main(void) {
  U u = { 0 };
  typedef unsigned char *ucp;
  for (ucp p = (ucp)&u; p < (ucp)&1[&u]; ++p) {
    printf("%02x", (unsigned)*p);
  }
  printf("\n");
}

@efriedma-quic
Copy link
Collaborator

(For reference, this was reapplied in #110051.)

I'm not anticipating significant performance regressions here in normal usage: all the bits that are getting initialized are padding adjacent to initialized fields, so I expect the additional initialization will usually be free, and cheap in edge cases where it isn't free. There's a chance of regressions in some edge cases, but that doesn't seem like enough to warrant an option.

Could you clarify which parts of this you'd want to disable? Handling for all variables? Handling for local variables? Handling for variables where C23 rules don't require zero-init?

@hubert-reinterpretcast
Copy link
Collaborator

Handling for local variables (not initialized with {}). The performance impact of guaranteeing the extension is artificially reduced because the baseline code generation uses memset over the entire union when the size is large, but initializing the bytes after the first member is not mandated.

@yabinc
Copy link
Contributor Author

yabinc commented Oct 16, 2024

According to the C23 standard, initializing a union variable with empty braces {} only guarantees the initialization of the first member and any padding after all members. This behavior deviates from common intuition and caused a bug when compiling the Linux kernel.
Based on the discussion in #78034, this patch aims to zero-initialize all bytes after the first member.
If this change introduces significant regressions in edge cases, a fallback strategy is to replace brace initialization with explicit assignment statements for the specific fields requiring initialization.

@efriedma-quic
Copy link
Collaborator

the baseline code generation uses memset over the entire union when the size is large

That looks like an unintended side-effect of CheckAggExprForMemSetUse: it's supposed to check how much of the initialization is non-zero, but it's assuming all padding needs to be zeroed. Which we weren't explicitly doing before, but I guess it's convenient here. And given nobody has filed a bug report for this (I think?), it doesn't seem this memset is causing significant issues for anyone.

@yabinc
Copy link
Contributor Author

yabinc commented Oct 17, 2024

I am not sure if it is worth using Store instead of memset for 8 bytes or fewer, as below:
if (BitSize <= 64) { Builder.CreateStore(Builder.getIntN(BitSize, 0), Addr); }

As shown in https://godbolt.org/z/PEW1zxc1G, memset is not optimized away in -O2 -emit-llvm, but it is optimized to stores in the arm64/x86_64 backend with -O2.

@efriedma-quic
Copy link
Collaborator

There are two possible alternative sequences we could generate there:

int test(int x, int y) {
    struct S s;
    __builtin_memset(&s, 0, 8);
    s.x = x;
    s.y = y;
    f(&s);
}
int test2(int x, int y) {
    struct S s;
    long long dummy = (unsigned char)x;
    __builtin_memcpy(&s, &dummy, 8);
    s.x = x;
    s.y = y;
    f(&s);
}

Whether one of them is better might depend on the context... for test2, the zero-extend can be folded away in some cases.

I'd prefer to handle this on the LLVM side if possible (memcpyopt, or something like that); it's probably tricky to catch all the relevant cases in clang.

@efriedma-quic
Copy link
Collaborator

(There's basically no benefit to just directly replacing a memset with an equivalent store; most relevant LLVM optimizations are memset-aware.)

@yabinc
Copy link
Contributor Author

yabinc commented Oct 17, 2024

I see. Thanks for the explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The linux kernel expects initializers to FULLY initialize variables". Padding, other union members
8 participants