[AArch64] Change the coercion type of structs with pointer members. #135064

Merged

davemgreen merged 1 commit into llvm:main from gh-a64-coerceptr on Jun 10, 2025

Conversation

davemgreen
Collaborator

The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
convert to a ptr or [2 x ptr] as opposed to the old coercion that uses i64 or
[2 x i64]. This helps alias analysis produce more accurate results.
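For illustration, here is a minimal sketch of the effect (hypothetical names; the IR shapes mirror the test expectations further down):

struct TwoPtrs {
  int *x, *y; // two 64-bit pointers, 128 bits total, every field is a pointer
};
void take(TwoPtrs s) { *s.x = 1; } // passed by value

// Old coercion:  the argument arrives as [2 x i64]; callers emit ptrtoint and
//                the callee emits inttoptr to rebuild the pointers.
// New coercion:  the argument arrives as [2 x ptr]; the pointers stay pointers,
//                so alias analysis keeps their provenance.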

@llvmbot added the clang, backend:AArch64, and clang:codegen labels on Apr 9, 2025
@llvmbot
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes

The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
convert to a ptr or [2 x ptr] as opposed to the old coercion that uses i64 or
[2 x i64]. This helps alias analysis produce more accurate results.


Full diff: https://github.com/llvm/llvm-project/pull/135064.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/AArch64.cpp (+18)
  • (added) clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp (+93)
  • (modified) clang/test/CodeGenCXX/trivial_abi.cpp (+5-8)
diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp
index 073ca3cc82690..9dc5f824254eb 100644
--- a/clang/lib/CodeGen/Targets/AArch64.cpp
+++ b/clang/lib/CodeGen/Targets/AArch64.cpp
@@ -485,6 +485,24 @@ ABIArgInfo AArch64ABIInfo::classifyArgumentType(QualType Ty, bool IsVariadicFn,
     }
     Size = llvm::alignTo(Size, Alignment);
 
+    // If the Aggregate is made up of pointers, use an array of pointers for the
+    // coerced type. This prevents having to convert ptr2int->int2ptr through
+    // the call, allowing alias analysis to produce better code.
+    if (const RecordType *RT = Ty->getAs<RecordType>()) {
+      if (const RecordDecl *RD = RT->getDecl()) {
+        if (all_of(RD->fields(), [](FieldDecl *FD) {
+              return FD->getType()->isPointerOrReferenceType();
+            })) {
+          assert((Size == 64 || Size == 128) &&
+                 "Expected a 64 or 128bit struct containing pointers");
+          llvm::Type *PtrTy = llvm::PointerType::getUnqual(getVMContext());
+          if (Size == 128)
+            PtrTy = llvm::ArrayType::get(PtrTy, 2);
+          return ABIArgInfo::getDirect(PtrTy);
+        }
+      }
+    }
+
     // We use a pair of i64 for 16-byte aggregate with 8-byte alignment.
     // For aggregates with 16-byte alignment, we use i128.
     llvm::Type *BaseTy = llvm::Type::getIntNTy(getVMContext(), Alignment);
diff --git a/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp b/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp
new file mode 100644
index 0000000000000..c2d68ae7ef6cd
--- /dev/null
+++ b/clang/test/CodeGen/AArch64/struct-coerce-using-ptr.cpp
@@ -0,0 +1,93 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple aarch64-none-elf -fcxx-exceptions -fexceptions -emit-llvm -o - %s | FileCheck %s
+
+struct Sp {
+    int *x;
+};
+// CHECK-LABEL: define dso_local void @_Z2Tp2Sp(
+// CHECK-SAME: ptr [[S_COERCE:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SP:%.*]], align 8
+// CHECK-NEXT:    [[COERCE_DIVE:%.*]] = getelementptr inbounds nuw [[STRUCT_SP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    store ptr [[S_COERCE]], ptr [[COERCE_DIVE]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tp(Sp s) { *s.x = 1; }
+
+struct Spp {
+    int *x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Tpp3Spp(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPP:%.*]], align 8
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpp(Spp s) { *s.x = 1; }
+
+struct Sppp {
+    int *x, *y, *z;
+};
+// CHECK-LABEL: define dso_local void @_Z4Tppp4Sppp(
+// CHECK-SAME: ptr noundef [[S:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S_INDIRECT_ADDR:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:    store ptr [[S]], ptr [[S_INDIRECT_ADDR]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPPP:%.*]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tppp(Sppp s) { *s.x = 1; }
+
+struct Spi {
+    int *x, y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Tpi3Spi(
+// CHECK-SAME: [2 x i64] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPI:%.*]], align 8
+// CHECK-NEXT:    store [2 x i64] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPI]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpi(Spi s) { *s.x = 1; }
+
+struct Srp {
+    int &x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z3Trp3Srp(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SRP:%.*]], align 8
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 8
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SRP]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 8
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Trp(Srp s) { s.x = 1; }
+
+struct __attribute__((__packed__)) Spp_packed {
+    int *x, *y;
+};
+// CHECK-LABEL: define dso_local void @_Z10Tpp_packed10Spp_packed(
+// CHECK-SAME: [2 x ptr] [[S_COERCE:%.*]]) #[[ATTR0]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[S:%.*]] = alloca [[STRUCT_SPP_PACKED:%.*]], align 1
+// CHECK-NEXT:    store [2 x ptr] [[S_COERCE]], ptr [[S]], align 1
+// CHECK-NEXT:    [[X:%.*]] = getelementptr inbounds nuw [[STRUCT_SPP_PACKED]], ptr [[S]], i32 0, i32 0
+// CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[X]], align 1
+// CHECK-NEXT:    store i32 1, ptr [[TMP0]], align 4
+// CHECK-NEXT:    ret void
+//
+void Tpp_packed(Spp_packed s) { *s.x = 1; }
diff --git a/clang/test/CodeGenCXX/trivial_abi.cpp b/clang/test/CodeGenCXX/trivial_abi.cpp
index 90054dbf37ae3..b8cc0d1cc6528 100644
--- a/clang/test/CodeGenCXX/trivial_abi.cpp
+++ b/clang/test/CodeGenCXX/trivial_abi.cpp
@@ -68,11 +68,10 @@ struct D0 : B0, B1 {
 
 Small D0::m0() { return {}; }
 
-// CHECK: define{{.*}} void @_Z14testParamSmall5Small(i64 %[[A_COERCE:.*]])
+// CHECK: define{{.*}} void @_Z14testParamSmall5Small(ptr %[[A_COERCE:.*]])
 // CHECK: %[[A:.*]] = alloca %[[STRUCT_SMALL]], align 8
 // CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[A]], i32 0, i32 0
-// CHECK: %[[COERCE_VAL_IP:.*]] = inttoptr i64 %[[A_COERCE]] to ptr
-// CHECK: store ptr %[[COERCE_VAL_IP]], ptr %[[COERCE_DIVE]], align 8
+// CHECK: store ptr %[[A_COERCE]], ptr %[[COERCE_DIVE]], align 8
 // CHECK: %[[CALL:.*]] = call noundef ptr @_ZN5SmallD1Ev(ptr {{[^,]*}} %[[A]])
 // CHECK: ret void
 // CHECK: }
@@ -101,8 +100,7 @@ Small testReturnSmall() {
 // CHECK: %[[CALL1:.*]] = call noundef ptr @_ZN5SmallC1ERKS_(ptr {{[^,]*}} %[[AGG_TMP]], ptr noundef nonnull align 8 dereferenceable(8) %[[T]])
 // CHECK: %[[COERCE_DIVE:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[AGG_TMP]], i32 0, i32 0
 // CHECK: %[[V0:.*]] = load ptr, ptr %[[COERCE_DIVE]], align 8
-// CHECK: %[[COERCE_VAL_PI:.*]] = ptrtoint ptr %[[V0]] to i64
-// CHECK: call void @_Z14testParamSmall5Small(i64 %[[COERCE_VAL_PI]])
+// CHECK: call void @_Z14testParamSmall5Small(ptr %[[V0]])
 // CHECK: %[[CALL2:.*]] = call noundef ptr @_ZN5SmallD1Ev(ptr {{[^,]*}} %[[T]])
 // CHECK: ret void
 // CHECK: }
@@ -120,8 +118,7 @@ void testCallSmall0() {
 // CHECK: store ptr %[[COERCE_VAL_IP]], ptr %[[COERCE_DIVE]], align 8
 // CHECK: %[[COERCE_DIVE1:.*]] = getelementptr inbounds nuw %[[STRUCT_SMALL]], ptr %[[AGG_TMP]], i32 0, i32 0
 // CHECK: %[[V0:.*]] = load ptr, ptr %[[COERCE_DIVE1]], align 8
-// CHECK: %[[COERCE_VAL_PI:.*]] = ptrtoint ptr %[[V0]] to i64
-// CHECK: call void @_Z14testParamSmall5Small(i64 %[[COERCE_VAL_PI]])
+// CHECK: call void @_Z14testParamSmall5Small(ptr %[[V0]])
 // CHECK: ret void
 // CHECK: }
 
@@ -226,7 +223,7 @@ NonTrivial testReturnHasNonTrivial() {
 // CHECK: call noundef ptr @_ZN5SmallC1Ev(ptr {{[^,]*}} %[[AGG_TMP]])
 // CHECK: invoke noundef ptr @_ZN5SmallC1Ev(ptr {{[^,]*}} %[[AGG_TMP1]])
 
-// CHECK: call void @_Z20calleeExceptionSmall5SmallS_(i64 %{{.*}}, i64 %{{.*}})
+// CHECK: call void @_Z20calleeExceptionSmall5SmallS_(ptr %{{.*}}, ptr %{{.*}})
 // CHECK-NEXT: ret void
 
 // CHECK: landingpad { ptr, i32 }

@llvmbot
Member

llvmbot commented Apr 9, 2025

@llvm/pr-subscribers-clang

Collaborator

@efriedma-quic left a comment


The general idea here makes sense.

davemgreen force-pushed the gh-a64-coerceptr branch 3 times, most recently from 84e87d5 to 9f23881 on April 29, 2025 at 08:39
@davemgreen
Collaborator Author

Rebase and ping - thanks.

The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct that is
<= 128 bits in size and contains only pointers (so either 1 or 2 of them), we
convert to a ptr or [2 x ptr] as opposed to the old coercion that uses i64 or
[2 x i64].
@davemgreen
Collaborator Author

Rebase and ping - it feels like this is a decent compromise for keeping the code simple.

Collaborator

@efriedma-quic left a comment


LGTM

davemgreen merged commit 5f648c3 into llvm:main on Jun 10, 2025
11 checks passed
davemgreen deleted the gh-a64-coerceptr branch on June 10, 2025 at 06:04
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
…lvm#135064)

DhruvSrivastavaX pushed a commit to DhruvSrivastavaX/lldb-for-aix that referenced this pull request Jun 12, 2025
…lvm#135064)

@Caslyn
Contributor

Caslyn commented Jun 12, 2025

Hi @davemgreen - after this commit, we’ve observed clang++ crashes compiling existing code with:

clang++: llvm/lib/IR/Instructions.cpp:3038: static CastInst *llvm::CastInst::Create(Instruction::CastOps, Value *, Type *, const Twine &, InsertPosition): Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed.

Here’s a reduced reproducer:

// bin/clang -cc1 -triple aarch64-unknown-fuchsia -target-feature -aes -target-feature -crypto -target-feature -fp-armv8 -target-feature -neon -target-feature -sha2 -fno-rtti -emit-llvm

struct RomTable {
  template <typename IoProvider, typename ComponentCallback>
  static void Walk(IoProvider, int, ComponentCallback) {};
};
struct RegisterMmioScaled {
  RegisterMmioScaled(void *);
  __attribute__((address_space(100))) char *mmio_;
};
long addr;
int size;
void WalkRomTable() {
  RegisterMmioScaled mmio(reinterpret_cast<void *>(addr));
  RomTable::Walk(mmio, size, [] {});
}

Could you please take a look and revert this change if this will require a non-trivial fix?

@davemgreen
Collaborator Author

Thanks for the report - I'll put together a fix to exclude address spaces for the moment.

davemgreen added a commit that referenced this pull request Jun 12, 2025
…pes.

As reported on #135064, the generic pointer coercion code in
CoerceIntOrPtrToIntOrPtr cannot handle address space casts (it tries to bitcast
the pointers). This bails out if an address space qualifier is found on the
pointer.
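A minimal sketch of the kind of struct the follow-up excludes (assumed behaviour inferred from the description above, not the committed test):

struct Qual {
  __attribute__((address_space(100))) char *p; // pointer member in a non-default address space
};
void g(Qual s) {} // passed by value

// The member lowers to `ptr addrspace(100)`, which cannot be bitcast to a generic
// `ptr`, so the new ptr/[2 x ptr] coercion is skipped for structs like this and
// the argument keeps the previous integer coercion.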
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…lvm#135064)

tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…pes.

@eaeltsin
Contributor

Heads up - this change causes tag-mismatch failures under hwasan on a bunch of our internal tests.

@eaeltsin
Contributor

The issue seems to be with pointers to functions, adding

           FDTy = getContext().getBaseElementType(FDTy);
         return (FDTy->isPointerOrReferenceType() &&
                 getContext().getTypeSize(FDTy) == 64 &&
-                !FDTy->getPointeeType().hasAddressSpace()) ||
+                !FDTy->getPointeeType().hasAddressSpace() &&
+                !FDTy->getPointeeType()->isFunctionType()) ||
                Self(Self, FDTy);
       });
     };

to clang/lib/CodeGen/Targets/AArch64.cpp:512 makes the tests pass.

The problematic type in our case seems to be absl::FunctionRef.

Please let me know if this is sufficient; reducing the internal test to a standalone reproducer might take some time.
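For context, a FunctionRef-style type is essentially two words: a type-erased pointer to the callable's state plus an invoker function pointer. A simplified stand-in (an approximation, not the real absl::FunctionRef layout) that now takes the new coercion:

struct FnRef {
  void *obj;              // type-erased pointer to the wrapped callable
  void (*invoke)(void *); // function pointer used to call it
};
void run(FnRef f) { f.invoke(f.obj); }
// All fields are pointers and the struct is 128 bits, so with this patch it is
// passed as [2 x ptr]; previously it was passed as [2 x i64] via ptrtoint/inttoptr.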

@davemgreen
Collaborator Author

Hi - that might break some of my motivating examples (they were pointers passed in structs from lambda captures; I don't remember if the pointers were to functions or not, but there were definitely a lot of function pointers flying around). In general I would have expected it to be better to avoid the ptrtoint->inttoptr round-trip when passing through a function argument, though. Can we make it bail out for hwasan only with function pointers?

I was trying code like https://godbolt.org/z/xfW3vcjeo to see if I could hit the same issue (but imagine it running on an AArch64 machine). Any ideas what it might take to trigger it (a different option, a certain code pattern)?
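A rough sketch of that motivating pattern (hypothetical code, not the original source):

void sink(int *a, int *b) { *a = *b; }
template <typename F> void apply(F f) { f(); } // the closure object is passed by value

void caller(int *a, int *b) {
  // The capture struct holds two pointers (128 bits, all pointer fields), so with
  // this patch it is coerced to [2 x ptr] instead of being round-tripped through [2 x i64].
  apply([a, b] { sink(a, b); });
}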

@eaeltsin
Contributor

The tests fail with SIGSEGV without hwasan, so there is some problem revealed by this change either in the compiler or in the tests. Bailing out under hwasan is not a fix.

Disallowing function pointers makes the problem go away, so that might be a useful hint. Another hint is that the issue seems to be limited to 128-bit structs.

We are trying to cook a standalone reproducer.

@vitalybuka
Collaborator

@eaeltsin Isn't the original issue a new SIGSEGV on non-hwasan arm builds?

@davemgreen
Collaborator Author

I don't believe I've seen anything else like that, and my understanding is that the new IR should be simpler than before. Let me know if you have anything that can show the issue you are running into. Thanks

akuhlens pushed a commit to akuhlens/llvm-project that referenced this pull request Jun 24, 2025
…pes.

@eaeltsin
Contributor

eaeltsin commented Jun 24, 2025

Sorry, I don't have much to add so far. I can reproduce the problem in a pretty complex context (co_await inside a loop; the LICM pass seems to make the tests fail), but I cannot provide a reduced public reproducer yet. Still looking.
