Skip to content

[WIP][Clang] Allow floating point fixed vectors with atomic builtins #129495

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

vikramRH
Copy link
Contributor

@vikramRH vikramRH commented Mar 3, 2025

#86796 added support for atomicrmw FP ops with fixed vector types. This patch intends to allow the same with clang atomic builtins. Any comments/concerns here would be helpful..

@vikramRH vikramRH requested review from arsenm, jyknight and yxsamliu March 3, 2025 08:35
Comment on lines 534 to 539
auto IsFloat = E->getValueType()->isVectorType()
? E->getValueType()
->castAs<VectorType>()
->getElementType()
->isFloatingType()
: E->getValueType()->isFloatingType();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No auto and this is much more verbose than necessary. Definitely should not have the isVector + castAs pattern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did look for a nicer way to query and unfortunately didnt find one. should we add a new helper here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a new helper , which also handles complex type rejection now..

// FLOAT: atomicrmw fsub ptr {{.*}} <2 x float> {{.*}} monotonic
__atomic_fetch_sub(f, ff, memory_order_relaxed);

#ifdef DOUBLE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know why this macro check is here, but the unguarded typedef probably defeats the point. Can we just not have it

@@ -2312,6 +2312,17 @@ bool Type::isRealType() const {
return isBitIntType();
}

bool Type::isFPAtomicCompatibleType() const {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this is the right place for the predicate, ASTContext seemed like a good option too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It does not depend on other factors.

@vikramRH vikramRH requested a review from rjmccall March 20, 2025 10:05
@vikramRH vikramRH marked this pull request as ready for review March 21, 2025 11:24
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 21, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Vikram Hegde (vikramRH)

Changes

#86796 added support for atomicrmw FP ops with fixed vector types. This patch intends to allow the same with clang atomic builtins. Any comments/concerns here would be helpful..


Full diff: https://github.com/llvm/llvm-project/pull/129495.diff

7 Files Affected:

  • (modified) clang/include/clang/AST/Type.h (+3)
  • (modified) clang/lib/AST/Type.cpp (+11)
  • (modified) clang/lib/CodeGen/CGAtomic.cpp (+14-20)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+6-5)
  • (modified) clang/test/CodeGen/fp-atomic-ops.c (+22)
  • (modified) clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu (+51)
  • (modified) clang/test/Sema/atomic-ops.c (+36-8)
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index c3ff7ebd88516..34f0037e83efc 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -2738,6 +2738,9 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
   /// Determine wither this type is a C++ elaborated-type-specifier.
   bool isElaboratedTypeSpecifier() const;
 
+  // check whether the type is compatible with fp atomics.
+  bool isFPAtomicCompatibleType() const;
+
   bool canDecayToPointerType() const;
 
   /// Whether this type is represented natively as a pointer.  This includes
diff --git a/clang/lib/AST/Type.cpp b/clang/lib/AST/Type.cpp
index 8c11ec2e1fe24..3b082443d0ce3 100644
--- a/clang/lib/AST/Type.cpp
+++ b/clang/lib/AST/Type.cpp
@@ -2312,6 +2312,17 @@ bool Type::isRealType() const {
   return isBitIntType();
 }
 
+bool Type::isFPAtomicCompatibleType() const {
+  if (isa<ComplexType>(CanonicalType))
+    return false;
+  if (const auto *CVT = dyn_cast<VectorType>(CanonicalType)) {
+    if (CVT->isSizelessVectorType())
+      return false;
+    return CVT->getElementType()->isFPAtomicCompatibleType();
+  }
+  return isFloatingType();
+}
+
 bool Type::isArithmeticType() const {
   if (const auto *BT = dyn_cast<BuiltinType>(CanonicalType))
     return BT->getKind() >= BuiltinType::Bool &&
diff --git a/clang/lib/CodeGen/CGAtomic.cpp b/clang/lib/CodeGen/CGAtomic.cpp
index 3adb2a7ad207f..776e989ef46cd 100644
--- a/clang/lib/CodeGen/CGAtomic.cpp
+++ b/clang/lib/CodeGen/CGAtomic.cpp
@@ -531,6 +531,7 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, Address Dest,
   bool PostOpMinMax = false;
   unsigned PostOp = 0;
 
+  bool IsFloat = E->getValueType()->isFPAtomicCompatibleType();
   switch (E->getOp()) {
   case AtomicExpr::AO__c11_atomic_init:
   case AtomicExpr::AO__opencl_atomic_init:
@@ -620,30 +621,26 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, Address Dest,
 
   case AtomicExpr::AO__atomic_add_fetch:
   case AtomicExpr::AO__scoped_atomic_add_fetch:
-    PostOp = E->getValueType()->isFloatingType() ? llvm::Instruction::FAdd
-                                                 : llvm::Instruction::Add;
+    PostOp = IsFloat ? llvm::Instruction::FAdd : llvm::Instruction::Add;
     [[fallthrough]];
   case AtomicExpr::AO__c11_atomic_fetch_add:
   case AtomicExpr::AO__hip_atomic_fetch_add:
   case AtomicExpr::AO__opencl_atomic_fetch_add:
   case AtomicExpr::AO__atomic_fetch_add:
   case AtomicExpr::AO__scoped_atomic_fetch_add:
-    Op = E->getValueType()->isFloatingType() ? llvm::AtomicRMWInst::FAdd
-                                             : llvm::AtomicRMWInst::Add;
+    Op = IsFloat ? llvm::AtomicRMWInst::FAdd : llvm::AtomicRMWInst::Add;
     break;
 
   case AtomicExpr::AO__atomic_sub_fetch:
   case AtomicExpr::AO__scoped_atomic_sub_fetch:
-    PostOp = E->getValueType()->isFloatingType() ? llvm::Instruction::FSub
-                                                 : llvm::Instruction::Sub;
+    PostOp = IsFloat ? llvm::Instruction::FSub : llvm::Instruction::Sub;
     [[fallthrough]];
   case AtomicExpr::AO__c11_atomic_fetch_sub:
   case AtomicExpr::AO__hip_atomic_fetch_sub:
   case AtomicExpr::AO__opencl_atomic_fetch_sub:
   case AtomicExpr::AO__atomic_fetch_sub:
   case AtomicExpr::AO__scoped_atomic_fetch_sub:
-    Op = E->getValueType()->isFloatingType() ? llvm::AtomicRMWInst::FSub
-                                             : llvm::AtomicRMWInst::Sub;
+    Op = IsFloat ? llvm::AtomicRMWInst::FSub : llvm::AtomicRMWInst::Sub;
     break;
 
   case AtomicExpr::AO__atomic_min_fetch:
@@ -655,11 +652,10 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, Address Dest,
   case AtomicExpr::AO__opencl_atomic_fetch_min:
   case AtomicExpr::AO__atomic_fetch_min:
   case AtomicExpr::AO__scoped_atomic_fetch_min:
-    Op = E->getValueType()->isFloatingType()
-             ? llvm::AtomicRMWInst::FMin
-             : (E->getValueType()->isSignedIntegerType()
-                    ? llvm::AtomicRMWInst::Min
-                    : llvm::AtomicRMWInst::UMin);
+    Op = IsFloat ? llvm::AtomicRMWInst::FMin
+                 : (E->getValueType()->isSignedIntegerType()
+                        ? llvm::AtomicRMWInst::Min
+                        : llvm::AtomicRMWInst::UMin);
     break;
 
   case AtomicExpr::AO__atomic_max_fetch:
@@ -671,11 +667,10 @@ static void EmitAtomicOp(CodeGenFunction &CGF, AtomicExpr *E, Address Dest,
   case AtomicExpr::AO__opencl_atomic_fetch_max:
   case AtomicExpr::AO__atomic_fetch_max:
   case AtomicExpr::AO__scoped_atomic_fetch_max:
-    Op = E->getValueType()->isFloatingType()
-             ? llvm::AtomicRMWInst::FMax
-             : (E->getValueType()->isSignedIntegerType()
-                    ? llvm::AtomicRMWInst::Max
-                    : llvm::AtomicRMWInst::UMax);
+    Op = IsFloat ? llvm::AtomicRMWInst::FMax
+                 : (E->getValueType()->isSignedIntegerType()
+                        ? llvm::AtomicRMWInst::Max
+                        : llvm::AtomicRMWInst::UMax);
     break;
 
   case AtomicExpr::AO__atomic_and_fetch:
@@ -984,9 +979,8 @@ RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E) {
   case AtomicExpr::AO__scoped_atomic_max_fetch:
   case AtomicExpr::AO__scoped_atomic_min_fetch:
   case AtomicExpr::AO__scoped_atomic_sub_fetch:
-    ShouldCastToIntPtrTy = !MemTy->isFloatingType();
+    ShouldCastToIntPtrTy = !MemTy->isFPAtomicCompatibleType();
     [[fallthrough]];
-
   case AtomicExpr::AO__atomic_fetch_and:
   case AtomicExpr::AO__atomic_fetch_nand:
   case AtomicExpr::AO__atomic_fetch_or:
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index f9926c6b4adab..d1021cfef764e 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -3758,7 +3758,7 @@ ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
   enum ArithOpExtraValueType {
     AOEVT_None = 0,
     AOEVT_Pointer = 1,
-    AOEVT_FP = 2,
+    AOEVT_FPorFPVec = 2,
   };
   unsigned ArithAllows = AOEVT_None;
 
@@ -3804,7 +3804,7 @@ ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
   case AtomicExpr::AO__opencl_atomic_fetch_sub:
   case AtomicExpr::AO__hip_atomic_fetch_add:
   case AtomicExpr::AO__hip_atomic_fetch_sub:
-    ArithAllows = AOEVT_Pointer | AOEVT_FP;
+    ArithAllows = AOEVT_Pointer | AOEVT_FPorFPVec;
     Form = Arithmetic;
     break;
   case AtomicExpr::AO__atomic_fetch_max:
@@ -3821,7 +3821,7 @@ ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
   case AtomicExpr::AO__opencl_atomic_fetch_min:
   case AtomicExpr::AO__hip_atomic_fetch_max:
   case AtomicExpr::AO__hip_atomic_fetch_min:
-    ArithAllows = AOEVT_FP;
+    ArithAllows = AOEVT_FPorFPVec;
     Form = Arithmetic;
     break;
   case AtomicExpr::AO__c11_atomic_fetch_and:
@@ -3982,7 +3982,8 @@ ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
         return true;
       if (ValType->isPointerType())
         return AllowedType & AOEVT_Pointer;
-      if (!(ValType->isFloatingType() && (AllowedType & AOEVT_FP)))
+      if (!(ValType->isFPAtomicCompatibleType() &&
+            (AllowedType & AOEVT_FPorFPVec)))
         return false;
       // LLVM Parser does not allow atomicrmw with x86_fp80 type.
       if (ValType->isSpecificBuiltinType(BuiltinType::LongDouble) &&
@@ -3992,7 +3993,7 @@ ExprResult Sema::BuildAtomicExpr(SourceRange CallRange, SourceRange ExprRange,
       return true;
     };
     if (!IsAllowedValueType(ValType, ArithAllows)) {
-      auto DID = ArithAllows & AOEVT_FP
+      auto DID = ArithAllows & AOEVT_FPorFPVec
                      ? (ArithAllows & AOEVT_Pointer
                             ? diag::err_atomic_op_needs_atomic_int_ptr_or_fp
                             : diag::err_atomic_op_needs_atomic_int_or_fp)
diff --git a/clang/test/CodeGen/fp-atomic-ops.c b/clang/test/CodeGen/fp-atomic-ops.c
index c894e7b4ade37..0e17c3278fbee 100644
--- a/clang/test/CodeGen/fp-atomic-ops.c
+++ b/clang/test/CodeGen/fp-atomic-ops.c
@@ -27,6 +27,9 @@ typedef enum memory_order {
   memory_order_seq_cst = __ATOMIC_SEQ_CST
 } memory_order;
 
+typedef float float2 __attribute__((ext_vector_type(2)));
+typedef double double2 __attribute__((ext_vector_type(2)));
+
 void test(float *f, float ff, double *d, double dd) {
   // FLOAT: atomicrmw fadd ptr {{.*}} monotonic
   __atomic_fetch_add(f, ff, memory_order_relaxed);
@@ -42,3 +45,22 @@ void test(float *f, float ff, double *d, double dd) {
   __atomic_fetch_sub(d, dd, memory_order_relaxed);
 #endif
 }
+
+typedef float float2 __attribute__((ext_vector_type(2)));
+typedef double double2 __attribute__((ext_vector_type(2)));
+
+void test_vector(float2 *f, float2 ff, double2 *d, double2 dd) {
+  // FLOAT: atomicrmw fadd ptr {{.*}} <2 x float> {{.*}} monotonic
+  __atomic_fetch_add(f, ff, memory_order_relaxed);
+
+  // FLOAT: atomicrmw fsub ptr {{.*}} <2 x float> {{.*}} monotonic
+  __atomic_fetch_sub(f, ff, memory_order_relaxed);
+
+#ifdef DOUBLE
+  // DOUBLE: atomicrmw fadd ptr {{.*}} <2 x double> {{.*}}  monotonic
+  __atomic_fetch_add(d, dd, memory_order_relaxed);
+
+  // DOUBLE: atomicrmw fsub ptr {{.*}} <2 x double> {{.*}} monotonic
+  __atomic_fetch_sub(d, dd, memory_order_relaxed);
+#endif
+}
diff --git a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
index 37fca614c3111..6afb39d6f8405 100644
--- a/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
+++ b/clang/test/CodeGenCUDA/amdgpu-atomic-ops.cu
@@ -20,6 +20,8 @@
 #include "Inputs/cuda.h"
 #include <stdatomic.h>
 
+typedef float __attribute__((ext_vector_type(2))) vector_float;
+
 __global__ void ffp1(float *p) {
   // CHECK-LABEL: @_Z4ffp1Pf
   // SAFEIR: atomicrmw fadd ptr {{.*}} monotonic, align 4{{$}}
@@ -225,6 +227,55 @@ __global__ void ffp6(_Float16 *p) {
   __hip_atomic_fetch_min(p, 1.0f, memory_order_relaxed, __HIP_MEMORY_SCOPE_WORKGROUP);
 }
 
+__global__ void ffp7(vector_float *p) {
+  // CHECK-LABEL: @_Z4ffp7PDv2_f
+  // SAFEIR: atomicrmw fadd ptr {{.*}}<2 x float>{{.*}} monotonic, align 8{{$}}
+  // SAFEIR: atomicrmw fsub ptr {{.*}}<2 x float>{{.*}} monotonic, align 8{{$}}
+  // SAFEIR: atomicrmw fmax ptr {{.*}}<2 x float>{{.*}} monotonic, align 8{{$}}
+  // SAFEIR: atomicrmw fmin ptr {{.*}}<2 x float>{{.*}} monotonic, align 8{{$}}
+  // SAFEIR: atomicrmw fadd ptr {{.*}}<2 x float>{{.*}} syncscope("agent") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]]{{$}}
+  // SAFEIR: atomicrmw fsub ptr {{.*}}<2 x float>{{.*}} syncscope("workgroup") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]]{{$}}
+  // SAFEIR: atomicrmw fmax ptr {{.*}}<2 x float>{{.*}} syncscope("agent") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]]{{$}}
+  // SAFEIR: atomicrmw fmin ptr {{.*}}<2 x float>{{.*}} syncscope("workgroup") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]]{{$}}
+
+  // UNSAFEIR: atomicrmw fadd ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fsub ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fmax ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fmin ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fadd ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]], !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fsub ptr {{.*}}<2 x float>{{.*}} monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]], !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fmax ptr {{.*}}<2 x float>{{.*}} syncscope("agent") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]], !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+  // UNSAFEIR: atomicrmw fmin ptr {{.*}}<2 x float>{{.*}} syncscope("workgroup") monotonic, align 8, !noalias.addrspace ![[$NO_PRIVATE]], !amdgpu.no.fine.grained.memory !{{[0-9]+$}}
+
+  // SAFE: _Z4ffp7PDv2_f
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+  // SAFE: global_atomic_cmpswap
+
+  // UNSAFE: _Z4ffp7PDv2_f
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  // UNSAFE: global_atomic_cmpswap
+  __atomic_fetch_add(p, {1.0f, 1.0f}, memory_order_relaxed);
+  __atomic_fetch_sub(p, {1.0f, 1.0f}, memory_order_relaxed);
+  __atomic_fetch_max(p, {1.0f, 1.0f}, memory_order_relaxed);
+  __atomic_fetch_min(p, {1.0f, 1.0f}, memory_order_relaxed);
+  __hip_atomic_fetch_add(p, {1.0f, 1.0f}, memory_order_relaxed, __HIP_MEMORY_SCOPE_AGENT);
+  __hip_atomic_fetch_sub(p, {1.0f, 1.0f}, memory_order_relaxed, __HIP_MEMORY_SCOPE_WORKGROUP);
+  __hip_atomic_fetch_max(p, {1.0f, 1.0f}, memory_order_relaxed, __HIP_MEMORY_SCOPE_AGENT);
+  __hip_atomic_fetch_min(p, {1.0f, 1.0f}, memory_order_relaxed, __HIP_MEMORY_SCOPE_WORKGROUP);
+}
+
 // CHECK-LABEL: @_Z12test_cmpxchgPiii
 // CHECK: cmpxchg ptr %{{.+}}, i32 %{{.+}}, i32 %{{.+}} acquire acquire, align 4{{$}}
 // CHECK: cmpxchg weak ptr %{{.+}}, i32 %{{.+}}, i32 %{{.+}} acquire acquire, align 4{{$}}
diff --git a/clang/test/Sema/atomic-ops.c b/clang/test/Sema/atomic-ops.c
index 725a12060d4e0..03d57517d1571 100644
--- a/clang/test/Sema/atomic-ops.c
+++ b/clang/test/Sema/atomic-ops.c
@@ -1,19 +1,19 @@
-// RUN: %clang_cc1 %s -verify=expected,fp80,noi128 -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify=expected,fp80,noi128 -fenable-matrix -fgnuc-version=4.2.1 -ffreestanding \
 // RUN:   -fsyntax-only -triple=i686-linux-gnu -std=c11
-// RUN: %clang_cc1 %s -verify=expected,noi128 -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify=expected,noi128 -fenable-matrix -fgnuc-version=4.2.1 -ffreestanding \
 // RUN:   -fsyntax-only -triple=i686-linux-android -std=c11
-// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -fenable-matrix -ffreestanding \
 // RUN:   -fsyntax-only -triple=powerpc64-linux-gnu -std=c11
-// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -fenable-matrix -ffreestanding \
 // RUN:   -fsyntax-only -triple=powerpc64-linux-gnu -std=c11 \
 // RUN:   -target-cpu pwr7
-// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -fenable-matrix -ffreestanding \
 // RUN:   -fsyntax-only -triple=powerpc64le-linux-gnu -std=c11 \
 // RUN:   -target-cpu pwr8 -DPPC64_PWR8
-// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -fenable-matrix -ffreestanding \
 // RUN:   -fsyntax-only -triple=powerpc64-unknown-aix -std=c11 \
 // RUN:   -target-cpu pwr8
-// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -ffreestanding \
+// RUN: %clang_cc1 %s -verify -fgnuc-version=4.2.1 -fenable-matrix -ffreestanding \
 // RUN:   -fsyntax-only -triple=powerpc64-unknown-aix -std=c11 \
 // RUN:   -mabi=quadword-atomics -target-cpu pwr8 -DPPC64_PWR8
 
@@ -147,7 +147,15 @@ _Static_assert(__atomic_always_lock_free(2, (int[2]){}), "");
 void dummyfn();
 _Static_assert(__atomic_always_lock_free(2, dummyfn) || 1, "");
 
+typedef _Atomic(float __attribute__((vector_size(16)))) atomic_vector_float;
+typedef _Atomic(double __attribute__((vector_size(16)))) atomic_vector_double;
+typedef _Atomic(int __attribute__((vector_size(16)))) atomic_vector_int;
+typedef float __attribute__((ext_vector_type(4))) vector_float;
+typedef double __attribute__((ext_vector_type(2))) vector_double;
+typedef int __attribute__((ext_vector_type(4))) vector_int;
 
+typedef float float_mat_5x5 __attribute__((matrix_type(5, 5)));
+typedef _Complex double ComplexDouble;
 
 #define _AS1 __attribute__((address_space(1)))
 #define _AS2 __attribute__((address_space(2)))
@@ -156,7 +164,11 @@ void f(_Atomic(int) *i, const _Atomic(int) *ci,
        _Atomic(int*) *p, _Atomic(float) *f, _Atomic(double) *d,
        _Atomic(long double) *ld,
        int *I, const int *CI,
-       int **P, float *F, double *D, struct S *s1, struct S *s2) {
+       int **P, float *F, double *D, struct S *s1, struct S *s2,
+       atomic_vector_float* vf, atomic_vector_double* vd, 
+       atomic_vector_int* vi, vector_float* evf, 
+       vector_double* evd, vector_int* evi, float_mat_5x5* fm,
+       ComplexDouble* cd) {
   __c11_atomic_init(I, 5); // expected-error {{pointer to _Atomic}}
   __c11_atomic_init(ci, 5); // expected-error {{address argument to atomic operation must be a pointer to non-const _Atomic type ('const _Atomic(int) *' invalid)}}
 
@@ -224,6 +236,13 @@ void f(_Atomic(int) *i, const _Atomic(int) *ci,
   __c11_atomic_fetch_add(f, 1.0f, memory_order_seq_cst);
   __c11_atomic_fetch_add(d, 1.0, memory_order_seq_cst);
   __c11_atomic_fetch_add(ld, 1.0, memory_order_seq_cst); // fp80-error {{must be a pointer to atomic integer, pointer or supported floating point type}}
+
+  vector_float fvec = {1.0f, 1.0f, 1.0f, 1.0f};
+  vector_double dvec = {1.0, 1.0};
+  vector_int ivec = {1, 1, 1, 1};
+  __c11_atomic_fetch_add(vf, fvec, memory_order_seq_cst);
+  __c11_atomic_fetch_add(vd, dvec, memory_order_seq_cst);
+  __c11_atomic_fetch_add(vi, ivec, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer, pointer or supported floating point type}}
   __c11_atomic_fetch_min(i, 1, memory_order_seq_cst);
   __c11_atomic_fetch_min(p, 1, memory_order_seq_cst); // expected-error {{must be a pointer to atomic integer or supported floating point type}}
   __c11_atomic_fetch_min(f, 1.0f, memory_order_seq_cst);
@@ -240,6 +259,15 @@ void f(_Atomic(int) *i, const _Atomic(int) *ci,
   __atomic_fetch_sub(P, 3, memory_order_seq_cst);
   __atomic_fetch_sub(F, 3, memory_order_seq_cst);
   __atomic_fetch_sub(s1, 3, memory_order_seq_cst); // expected-error {{must be a pointer to integer, pointer or supported floating point type}}
+  __atomic_fetch_sub(evf, fvec, memory_order_seq_cst);
+  __atomic_fetch_sub(evd, dvec, memory_order_seq_cst);
+  __atomic_fetch_sub(evi, ivec, memory_order_seq_cst); // expected-error {{must be a pointer to integer, pointer or supported floating point type}}
+
+  float_mat_5x5 f1;
+  __atomic_fetch_sub(fm, f1, memory_order_seq_cst); // expected-error {{must be a pointer to integer, pointer or supported floating point type}}
+  ComplexDouble f2 = {1.0, 2.0};
+  __atomic_fetch_sub(cd, f2, memory_order_seq_cst); // expected-error {{must be a pointer to integer, pointer or supported floating point type}}
+
   __atomic_fetch_min(F, 3, memory_order_seq_cst);
   __atomic_fetch_min(D, 3, memory_order_seq_cst);
   __atomic_fetch_max(F, 3, memory_order_seq_cst);

@vikramRH
Copy link
Contributor Author

vikramRH commented Apr 3, 2025

Ping... been a while since we last visited this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants