Skip to content

[ARM] Empty structs are 1-byte for C++ ABI #124762

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 31, 2025

Conversation

ostannard
Copy link
Collaborator

For C++ (but not C), empty structs should be passed to functions as if they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
For the purposes of parameter passing in AAPCS32, a parameter whose
type is an empty class shall be treated as if its type were an
aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing an array of size zero, I've kept that logic for ARM. I've not found a reason for this exception, but I've checked that GCC does have the same behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores empty structs in both C and C++. This is documented at https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms. The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS, but I can't find any documentation for that ABI, so I'm not sure what rules it should follow. For now I've left it following the AArch64 Apple rules.

For C++ (but not C), empty structs should be passed to functions as if
they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
  https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
  For the purposes of parameter passing in AAPCS32, a parameter whose
  type is an empty class shall be treated as if its type were an
  aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing
an array of size zero, I've kept that logic for ARM. I've not found a
reason for this exception, but I've checked that GCC does have the same
behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores
empty structs in both C and C++. This is documented at
https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms.
The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS,
but I can't find any documentation for that ABI, so I'm not sure what
rules it should follow. For now I've left it following the AArch64 Apple
rules.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jan 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-backend-arm

@llvm/pr-subscribers-clang

Author: Oliver Stannard (ostannard)

Changes

For C++ (but not C), empty structs should be passed to functions as if they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
For the purposes of parameter passing in AAPCS32, a parameter whose
type is an empty class shall be treated as if its type were an
aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing an array of size zero, I've kept that logic for ARM. I've not found a reason for this exception, but I've checked that GCC does have the same behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores empty structs in both C and C++. This is documented at https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms. The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS, but I can't find any documentation for that ABI, so I'm not sure what rules it should follow. For now I've left it following the AArch64 Apple rules.


Full diff: https://github.com/llvm/llvm-project/pull/124762.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/ARM.cpp (+37-6)
  • (added) clang/test/CodeGen/arm-empty-args.cpp (+119)
diff --git a/clang/lib/CodeGen/Targets/ARM.cpp b/clang/lib/CodeGen/Targets/ARM.cpp
index 2d858fa2f3c3a3..b243ccacc2155d 100644
--- a/clang/lib/CodeGen/Targets/ARM.cpp
+++ b/clang/lib/CodeGen/Targets/ARM.cpp
@@ -71,6 +71,7 @@ class ARMABIInfo : public ABIInfo {
                                   unsigned functionCallConv) const;
   ABIArgInfo classifyHomogeneousAggregate(QualType Ty, const Type *Base,
                                           uint64_t Members) const;
+  bool shouldIgnoreEmptyArg(QualType Ty) const;
   ABIArgInfo coerceIllegalVector(QualType Ty) const;
   bool isIllegalVectorType(QualType Ty) const;
   bool containsAnyFP16Vectors(QualType Ty) const;
@@ -328,6 +329,26 @@ ABIArgInfo ARMABIInfo::classifyHomogeneousAggregate(QualType Ty,
   return ABIArgInfo::getDirect(nullptr, 0, nullptr, false, Align);
 }
 
+bool ARMABIInfo::shouldIgnoreEmptyArg(QualType Ty) const {
+  uint64_t Size = getContext().getTypeSize(Ty);
+  assert((isEmptyRecord(getContext(), Ty, true) || Size == 0) &&
+         "Arg is not empty");
+
+  // Empty records are ignored in C mode, and in C++ on WatchOS.
+  if (!getContext().getLangOpts().CPlusPlus ||
+      getABIKind() == ARMABIKind::AAPCS16_VFP)
+    return true;
+
+  // In C++ mode, arguments which have sizeof() == 0 are ignored. This is not a
+  // situation which is defined by any C++ standard or ABI, but this matches
+  // GCC's de facto ABI.
+  if (Size == 0)
+    return true;
+
+  // Otherwise, they are passed as if they have a size of 1 byte.
+  return false;
+}
+
 ABIArgInfo ARMABIInfo::classifyArgumentType(QualType Ty, bool isVariadic,
                                             unsigned functionCallConv) const {
   // 6.1.2.1 The following argument types are VFP CPRCs:
@@ -366,9 +387,15 @@ ABIArgInfo ARMABIInfo::classifyArgumentType(QualType Ty, bool isVariadic,
     return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
   }
 
-  // Ignore empty records.
-  if (isEmptyRecord(getContext(), Ty, true))
-    return ABIArgInfo::getIgnore();
+  // Empty records are either ignored completely or passed as if they were a
+  // 1-byte object, depending on the ABI and language standard.
+  if (isEmptyRecord(getContext(), Ty, true) ||
+      getContext().getTypeSize(Ty) == 0) {
+    if (shouldIgnoreEmptyArg(Ty))
+      return ABIArgInfo::getIgnore();
+    else
+      return ABIArgInfo::getDirect(llvm::Type::getInt8Ty(getVMContext()));
+  }
 
   if (IsAAPCS_VFP) {
     // Homogeneous Aggregates need to be expanded when we can fit the aggregate
@@ -588,7 +615,8 @@ ABIArgInfo ARMABIInfo::classifyReturnType(QualType RetTy, bool isVariadic,
 
   // Otherwise this is an AAPCS variant.
 
-  if (isEmptyRecord(getContext(), RetTy, true))
+  if (isEmptyRecord(getContext(), RetTy, true) ||
+      getContext().getTypeSize(RetTy) == 0)
     return ABIArgInfo::getIgnore();
 
   // Check for homogeneous aggregates with AAPCS-VFP.
@@ -752,10 +780,13 @@ RValue ARMABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
   CharUnits SlotSize = CharUnits::fromQuantity(4);
 
   // Empty records are ignored for parameter passing purposes.
-  if (isEmptyRecord(getContext(), Ty, true))
+  uint64_t Size = getContext().getTypeSize(Ty);
+  bool IsEmpty = isEmptyRecord(getContext(), Ty, true);
+  if ((IsEmpty || Size == 0) && shouldIgnoreEmptyArg(Ty))
     return Slot.asRValue();
 
-  CharUnits TySize = getContext().getTypeSizeInChars(Ty);
+  CharUnits TySize =
+      std::max(getContext().getTypeSizeInChars(Ty), CharUnits::fromQuantity(1));
   CharUnits TyAlignForABI = getContext().getTypeUnadjustedAlignInChars(Ty);
 
   // Use indirect if size of the illegal vector is bigger than 16 bytes.
diff --git a/clang/test/CodeGen/arm-empty-args.cpp b/clang/test/CodeGen/arm-empty-args.cpp
new file mode 100644
index 00000000000000..5770ce160749f7
--- /dev/null
+++ b/clang/test/CodeGen/arm-empty-args.cpp
@@ -0,0 +1,119 @@
+// RUN: %clang_cc1 -triple armv7a-linux-gnueabi -emit-llvm -o - -x c %s | FileCheck %s --check-prefixes=CHECK,C
+// RUN: %clang_cc1 -triple armv7a-linux-gnueabi -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CXX
+// RUN: %clang_cc1 -triple thumbv7k-apple-watchos2.0 -target-abi aapcs16 -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WATCHOS
+
+// Empty structs are ignored for PCS purposes on WatchOS and in C mode
+// elsewhere.  In C++ mode they consume a register slot though. Functions are
+// slightly bigger than minimal to make confirmation against actual GCC
+// behaviour easier.
+
+#if __cplusplus
+#define EXTERNC extern "C"
+#else
+#define EXTERNC
+#endif
+
+struct Empty {};
+
+// C: define{{.*}} i32 @empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @empty_arg(i8 %e.coerce, i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @empty_arg(i32 noundef %a)
+EXTERNC int empty_arg(struct Empty e, int a) {
+  return a;
+}
+
+// C: define{{.*}} void @empty_ret()
+// CXX: define{{.*}} void @empty_ret()
+// WATCHOS: define{{.*}} void @empty_ret()
+EXTERNC struct Empty empty_ret(void) {
+  struct Empty e;
+  return e;
+}
+
+// However, what counts as "empty" is a baroque mess. This is super-empty, it's
+// ignored even in C++ mode. It also has sizeof == 0, violating C++, but that's
+// legacy for you:
+
+struct SuperEmpty {
+  int arr[0];
+};
+
+// C: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+EXTERNC int super_empty_arg(struct SuperEmpty e, int a) {
+  return a;
+}
+
+struct SortOfEmpty {
+  struct SuperEmpty e;
+};
+
+// C: define{{.*}} i32 @sort_of_empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @sort_of_empty_arg(i8 %e.coerce, i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @sort_of_empty_arg(i32 noundef %a)
+EXTERNC int sort_of_empty_arg(struct Empty e, int a) {
+  return a;
+}
+
+// C: define{{.*}} void @sort_of_empty_ret()
+// CXX: define{{.*}} void @sort_of_empty_ret()
+// WATCHOS: define{{.*}} void @sort_of_empty_ret()
+EXTERNC struct SortOfEmpty sort_of_empty_ret(void) {
+  struct SortOfEmpty e;
+  return e;
+}
+
+#include <stdarg.h>
+
+// va_arg matches the above rules, consuming an incoming argument in cases
+// where one would be passed, and not doing so when the argument should be
+// ignored.
+
+EXTERNC int empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX: %argp.next2 = getelementptr inbounds i8, ptr %argp.cur1, i32 4
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct Empty b = va_arg(vl, struct Empty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}
+
+EXTERNC int super_empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @super_empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX-NOT: {{ getelementptr }}
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct SuperEmpty b = va_arg(vl, struct SuperEmpty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}
+
+EXTERNC int sort_of_empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @sort_of_empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX-NOT: {{ getelementptr }}
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct SortOfEmpty b = va_arg(vl, struct SortOfEmpty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}

@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-clang-codegen

Author: Oliver Stannard (ostannard)

Changes

For C++ (but not C), empty structs should be passed to functions as if they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
For the purposes of parameter passing in AAPCS32, a parameter whose
type is an empty class shall be treated as if its type were an
aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing an array of size zero, I've kept that logic for ARM. I've not found a reason for this exception, but I've checked that GCC does have the same behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores empty structs in both C and C++. This is documented at https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms. The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS, but I can't find any documentation for that ABI, so I'm not sure what rules it should follow. For now I've left it following the AArch64 Apple rules.


Full diff: https://github.com/llvm/llvm-project/pull/124762.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/ARM.cpp (+37-6)
  • (added) clang/test/CodeGen/arm-empty-args.cpp (+119)
diff --git a/clang/lib/CodeGen/Targets/ARM.cpp b/clang/lib/CodeGen/Targets/ARM.cpp
index 2d858fa2f3c3a3..b243ccacc2155d 100644
--- a/clang/lib/CodeGen/Targets/ARM.cpp
+++ b/clang/lib/CodeGen/Targets/ARM.cpp
@@ -71,6 +71,7 @@ class ARMABIInfo : public ABIInfo {
                                   unsigned functionCallConv) const;
   ABIArgInfo classifyHomogeneousAggregate(QualType Ty, const Type *Base,
                                           uint64_t Members) const;
+  bool shouldIgnoreEmptyArg(QualType Ty) const;
   ABIArgInfo coerceIllegalVector(QualType Ty) const;
   bool isIllegalVectorType(QualType Ty) const;
   bool containsAnyFP16Vectors(QualType Ty) const;
@@ -328,6 +329,26 @@ ABIArgInfo ARMABIInfo::classifyHomogeneousAggregate(QualType Ty,
   return ABIArgInfo::getDirect(nullptr, 0, nullptr, false, Align);
 }
 
+bool ARMABIInfo::shouldIgnoreEmptyArg(QualType Ty) const {
+  uint64_t Size = getContext().getTypeSize(Ty);
+  assert((isEmptyRecord(getContext(), Ty, true) || Size == 0) &&
+         "Arg is not empty");
+
+  // Empty records are ignored in C mode, and in C++ on WatchOS.
+  if (!getContext().getLangOpts().CPlusPlus ||
+      getABIKind() == ARMABIKind::AAPCS16_VFP)
+    return true;
+
+  // In C++ mode, arguments which have sizeof() == 0 are ignored. This is not a
+  // situation which is defined by any C++ standard or ABI, but this matches
+  // GCC's de facto ABI.
+  if (Size == 0)
+    return true;
+
+  // Otherwise, they are passed as if they have a size of 1 byte.
+  return false;
+}
+
 ABIArgInfo ARMABIInfo::classifyArgumentType(QualType Ty, bool isVariadic,
                                             unsigned functionCallConv) const {
   // 6.1.2.1 The following argument types are VFP CPRCs:
@@ -366,9 +387,15 @@ ABIArgInfo ARMABIInfo::classifyArgumentType(QualType Ty, bool isVariadic,
     return getNaturalAlignIndirect(Ty, RAA == CGCXXABI::RAA_DirectInMemory);
   }
 
-  // Ignore empty records.
-  if (isEmptyRecord(getContext(), Ty, true))
-    return ABIArgInfo::getIgnore();
+  // Empty records are either ignored completely or passed as if they were a
+  // 1-byte object, depending on the ABI and language standard.
+  if (isEmptyRecord(getContext(), Ty, true) ||
+      getContext().getTypeSize(Ty) == 0) {
+    if (shouldIgnoreEmptyArg(Ty))
+      return ABIArgInfo::getIgnore();
+    else
+      return ABIArgInfo::getDirect(llvm::Type::getInt8Ty(getVMContext()));
+  }
 
   if (IsAAPCS_VFP) {
     // Homogeneous Aggregates need to be expanded when we can fit the aggregate
@@ -588,7 +615,8 @@ ABIArgInfo ARMABIInfo::classifyReturnType(QualType RetTy, bool isVariadic,
 
   // Otherwise this is an AAPCS variant.
 
-  if (isEmptyRecord(getContext(), RetTy, true))
+  if (isEmptyRecord(getContext(), RetTy, true) ||
+      getContext().getTypeSize(RetTy) == 0)
     return ABIArgInfo::getIgnore();
 
   // Check for homogeneous aggregates with AAPCS-VFP.
@@ -752,10 +780,13 @@ RValue ARMABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
   CharUnits SlotSize = CharUnits::fromQuantity(4);
 
   // Empty records are ignored for parameter passing purposes.
-  if (isEmptyRecord(getContext(), Ty, true))
+  uint64_t Size = getContext().getTypeSize(Ty);
+  bool IsEmpty = isEmptyRecord(getContext(), Ty, true);
+  if ((IsEmpty || Size == 0) && shouldIgnoreEmptyArg(Ty))
     return Slot.asRValue();
 
-  CharUnits TySize = getContext().getTypeSizeInChars(Ty);
+  CharUnits TySize =
+      std::max(getContext().getTypeSizeInChars(Ty), CharUnits::fromQuantity(1));
   CharUnits TyAlignForABI = getContext().getTypeUnadjustedAlignInChars(Ty);
 
   // Use indirect if size of the illegal vector is bigger than 16 bytes.
diff --git a/clang/test/CodeGen/arm-empty-args.cpp b/clang/test/CodeGen/arm-empty-args.cpp
new file mode 100644
index 00000000000000..5770ce160749f7
--- /dev/null
+++ b/clang/test/CodeGen/arm-empty-args.cpp
@@ -0,0 +1,119 @@
+// RUN: %clang_cc1 -triple armv7a-linux-gnueabi -emit-llvm -o - -x c %s | FileCheck %s --check-prefixes=CHECK,C
+// RUN: %clang_cc1 -triple armv7a-linux-gnueabi -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,CXX
+// RUN: %clang_cc1 -triple thumbv7k-apple-watchos2.0 -target-abi aapcs16 -emit-llvm -o - %s | FileCheck %s --check-prefixes=CHECK,WATCHOS
+
+// Empty structs are ignored for PCS purposes on WatchOS and in C mode
+// elsewhere.  In C++ mode they consume a register slot though. Functions are
+// slightly bigger than minimal to make confirmation against actual GCC
+// behaviour easier.
+
+#if __cplusplus
+#define EXTERNC extern "C"
+#else
+#define EXTERNC
+#endif
+
+struct Empty {};
+
+// C: define{{.*}} i32 @empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @empty_arg(i8 %e.coerce, i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @empty_arg(i32 noundef %a)
+EXTERNC int empty_arg(struct Empty e, int a) {
+  return a;
+}
+
+// C: define{{.*}} void @empty_ret()
+// CXX: define{{.*}} void @empty_ret()
+// WATCHOS: define{{.*}} void @empty_ret()
+EXTERNC struct Empty empty_ret(void) {
+  struct Empty e;
+  return e;
+}
+
+// However, what counts as "empty" is a baroque mess. This is super-empty, it's
+// ignored even in C++ mode. It also has sizeof == 0, violating C++, but that's
+// legacy for you:
+
+struct SuperEmpty {
+  int arr[0];
+};
+
+// C: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @super_empty_arg(i32 noundef %a)
+EXTERNC int super_empty_arg(struct SuperEmpty e, int a) {
+  return a;
+}
+
+struct SortOfEmpty {
+  struct SuperEmpty e;
+};
+
+// C: define{{.*}} i32 @sort_of_empty_arg(i32 noundef %a)
+// CXX: define{{.*}} i32 @sort_of_empty_arg(i8 %e.coerce, i32 noundef %a)
+// WATCHOS: define{{.*}} i32 @sort_of_empty_arg(i32 noundef %a)
+EXTERNC int sort_of_empty_arg(struct Empty e, int a) {
+  return a;
+}
+
+// C: define{{.*}} void @sort_of_empty_ret()
+// CXX: define{{.*}} void @sort_of_empty_ret()
+// WATCHOS: define{{.*}} void @sort_of_empty_ret()
+EXTERNC struct SortOfEmpty sort_of_empty_ret(void) {
+  struct SortOfEmpty e;
+  return e;
+}
+
+#include <stdarg.h>
+
+// va_arg matches the above rules, consuming an incoming argument in cases
+// where one would be passed, and not doing so when the argument should be
+// ignored.
+
+EXTERNC int empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX: %argp.next2 = getelementptr inbounds i8, ptr %argp.cur1, i32 4
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct Empty b = va_arg(vl, struct Empty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}
+
+EXTERNC int super_empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @super_empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX-NOT: {{ getelementptr }}
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct SuperEmpty b = va_arg(vl, struct SuperEmpty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}
+
+EXTERNC int sort_of_empty_arg_variadic(int a, ...) {
+// CHECK-LABEL: @sort_of_empty_arg_variadic(
+// C: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// C-NOT: {{ getelementptr }}
+// CXX: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// CXX-NOT: {{ getelementptr }}
+// WATCHOS: %argp.next = getelementptr inbounds i8, ptr %argp.cur, i32 4
+// WATCHOS-NOT: {{ getelementptr }}
+  va_list vl;
+  va_start(vl, a);
+  struct SortOfEmpty b = va_arg(vl, struct SortOfEmpty);
+  int c = va_arg(vl, int);
+  va_end(vl);
+  return c;
+}

Comment on lines +618 to +619
if (isEmptyRecord(getContext(), RetTy, true) ||
getContext().getTypeSize(RetTy) == 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this change actually have any effect?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed for the SortOfEmpty case, which isn't considered empty, but does have size 0.

return Slot.asRValue();

CharUnits TySize = getContext().getTypeSizeInChars(Ty);
CharUnits TySize =
std::max(getContext().getTypeSizeInChars(Ty), CharUnits::fromQuantity(1));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get here, we know the size isn't zero, I think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was left over from a previous version, I'll remove it.

// 1-byte object, depending on the ABI and language standard.
if (isEmptyRecord(getContext(), Ty, true) ||
getContext().getTypeSize(Ty) == 0) {
if (shouldIgnoreEmptyArg(Ty))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer to write this something like:

if (size == zero)
  return getIgnore();
if (isempty() && isWatchOS())
  return getIgnore();

This should be equivalent because an empty struct in C is size zero anyway.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, hmm... do we want to add fclang-abi-compat support for this change, or do we think it's too niche to matter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of the -fclang-abi-compat option, but I think that would be worth adding for this. I'll also add a release note as suggested in #124760.

@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Jan 29, 2025
- The AArch64 calling convention for empty structs in C++ mode was changed to
pass them as if they have a size of 1 byte, matching the AAPCS64
specification and GCC's implementation. The previous behaviour of ignoring
the argument can be restored using the -fclang-abi-compat=20 (or earlier)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

20? You're not planning to cherry-pick this to the release branch?

@ostannard
Copy link
Collaborator Author

That last commit message should have been "llvm-20", not 19.

@@ -1144,6 +1144,12 @@ Arm and AArch64 Support

* FUJITSU-MONAKA (fujitsu-monaka)

- The AArch64 calling convention for empty structs in C++ mode was changed to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is supposed to mention AArch32/AAPCS32?

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ostannard
Copy link
Collaborator Author

The release notes have now been cleared on main, so I'll re-add them in the llvm-20 cherry pick.

@ostannard ostannard merged commit 97b066f into llvm:main Jan 31, 2025
5 of 8 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder openmp-offload-amdgpu-runtime running on omp-vega20-0 while building clang at step 7 "Add check check-offload".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/30/builds/14992

Here is the relevant piece of the build log for the reference
Step 7 (Add check check-offload) failure: test (failure)
...
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug53727.cpp (999 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug50022.cpp (1000 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/test_libc.cpp (1001 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/wtime.c (1002 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp (1003 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu :: offloading/std_complex_arithmetic.cpp (1004 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/complex_reduction.cpp (1005 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/bug49021.cpp (1006 of 1008)
PASS: libomptarget :: x86_64-unknown-linux-gnu-LTO :: offloading/std_complex_arithmetic.cpp (1007 of 1008)
TIMEOUT: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp (1008 of 1008)
******************** TEST 'libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp' FAILED ********************
Exit Code: -9
Timeout: Reached timeout of 100 seconds

Command Output (stdout):
--
# RUN: at line 1
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# RUN: at line 2
/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp    -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src  -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib  -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a && /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp | /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/clang++ -fopenmp -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test -I /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -L /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -nogpulib -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/openmp/runtime/src -Wl,-rpath,/home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib -fopenmp-targets=amdgcn-amd-amdhsa -O3 /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp -o /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./lib/libomptarget.devicertl.a
# note: command had no output on stdout or stderr
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/ctor_dtor.cpp.tmp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True
# executed command: /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.build/./bin/FileCheck /home/ompworker/bbot/openmp-offload-amdgpu-runtime/llvm.src/offload/test/offloading/ctor_dtor.cpp
# note: command had no output on stdout or stderr
# error: command failed with exit status: -9
# error: command reached timeout: True

--

********************
Slowest Tests:
--------------------------------------------------------------------------
100.05s: libomptarget :: amdgcn-amd-amdhsa :: offloading/ctor_dtor.cpp
16.75s: libomptarget :: amdgcn-amd-amdhsa :: offloading/bug49021.cpp
13.21s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_max.cpp
12.98s: libomptarget :: amdgcn-amd-amdhsa :: offloading/parallel_target_teams_reduction_min.cpp
10.75s: libomptarget :: amdgcn-amd-amdhsa :: offloading/complex_reduction.cpp
9.40s: libomptarget :: amdgcn-amd-amdhsa :: jit/empty_kernel_lvl2.c
9.28s: libomptarget :: x86_64-unknown-linux-gnu :: offloading/bug49021.cpp

@ostannard
Copy link
Collaborator Author

/cherry-pick 97b066f

@llvmbot
Copy link
Member

llvmbot commented Jan 31, 2025

/cherry-pick 97b066f

Error: Command failed due to missing milestone.

@ostannard ostannard added this to the LLVM 20.X Release milestone Jan 31, 2025
@ostannard
Copy link
Collaborator Author

/cherry-pick 97b066f

@llvmbot
Copy link
Member

llvmbot commented Jan 31, 2025

/pull-request #125191

ostannard added a commit to ostannard/llvm-project that referenced this pull request Jan 31, 2025
For C++ (but not C), empty structs should be passed to functions as if
they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
  https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
  For the purposes of parameter passing in AAPCS32, a parameter whose
  type is an empty class shall be treated as if its type were an
  aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing
an array of size zero, I've kept that logic for ARM. I've not found a
reason for this exception, but I've checked that GCC does have the same
behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores
empty structs in both C and C++. This is documented at
https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms.
The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS,
but I can't find any documentation for that ABI, so I'm not sure what
rules it should follow. For now I've left it following the AArch64 Apple
rules.
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 10, 2025
For C++ (but not C), empty structs should be passed to functions as if
they are a 1 byte object with 1 byte alignment.

This is defined in Arm's CPPABI32:
  https://github.com/ARM-software/abi-aa/blob/main/cppabi32/cppabi32.rst
  For the purposes of parameter passing in AAPCS32, a parameter whose
  type is an empty class shall be treated as if its type were an
  aggregate with a single member of type unsigned byte.

The AArch64 equivalent of this has an exception for structs containing
an array of size zero, I've kept that logic for ARM. I've not found a
reason for this exception, but I've checked that GCC does have the same
behaviour for ARM as it does for AArch64.

The AArch64 version has an Apple ABI with different rules, which ignores
empty structs in both C and C++. This is documented at
https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms.
The ARM equivalent of that appears to be AAPCS16_VFP, used for WatchOS,
but I can't find any documentation for that ABI, so I'm not sure what
rules it should follow. For now I've left it following the AArch64 Apple
rules.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:ARM clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
Development

Successfully merging this pull request may close these issues.

4 participants