[clang][Sema][SYCL] Fix MSVC STL usage on AMDGPU (#135979)

sarnex · web-flow · commit 257b72758424 · 2025-04-18T15:28:46.000Z
The MSVC STL includes specializations of `_Is_memfunptr` for every function pointer type, including one per calling convention. The AMDGPU target doesn't support the x86 `vectorcall` calling convention, so clang replaces it with the default CC. The resulting specialization clashes with the existing one for the default CC, so we get a duplicate definition error when including `type_traits` (which we use heavily in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include the C++ STL, and it doesn't happen for CUDA/HIP because a similar workaround was done [here](fa49c3a).

I am not an expert in Sema, so I did a kind of hardcoded fix; please let me know if there is a better way to fix this. As far as I can tell, we can't do exactly the same fix that was done for CUDA because we can't differentiate between device and host code so easily.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
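For readers unfamiliar with the failure mode, the clash described above can be reproduced in miniature on any compiler. This is only an illustrative sketch, not the MSVC STL code: `DEMO_VECTORCALL`, `DEMO_SHOW_CLASH`, `DemoIsMemFunPtr`, and `DemoClass` are hypothetical names, and the empty macro merely imitates a target that silently swaps `__vectorcall` for the default calling convention.

```cpp
#include <type_traits>

// DEMO_VECTORCALL is a hypothetical stand-in for __vectorcall. Defining it as
// empty imitates a target that replaces the attribute with the default
// calling convention.
#define DEMO_VECTORCALL

// Hypothetical trait, loosely modeled on the shape of a per-CC trait.
template <typename F>
struct DemoIsMemFunPtr : std::false_type {};

// Specialization for the default calling convention.
template <typename Ret, typename C, typename... Args>
struct DemoIsMemFunPtr<Ret (C::*)(Args...)> : std::true_type {};

#ifdef DEMO_SHOW_CLASH
// With DEMO_VECTORCALL empty, this names the same type as the specialization
// above, so the compiler rejects it as a redefinition.
template <typename Ret, typename C, typename... Args>
struct DemoIsMemFunPtr<Ret (DEMO_VECTORCALL C::*)(Args...)> : std::true_type {};
#endif

struct DemoClass {
  void f() {}
};

static_assert(DemoIsMemFunPtr<decltype(&DemoClass::f)>::value,
              "member function pointers match the default-CC specialization");
static_assert(!DemoIsMemFunPtr<int>::value,
              "other types fall back to the primary template");
```

Compiling as written succeeds; adding `-DDEMO_SHOW_CLASH` should produce a redefinition error of the same kind as the one this patch avoids.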
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -5495,12 +5495,12 @@ bool Sema::CheckCallingConvAttr(const ParsedAttr &Attrs, CallingConv &CC,
 
   TargetInfo::CallingConvCheckResult A = TargetInfo::CCCR_OK;
   const TargetInfo &TI = Context.getTargetInfo();
+  auto *Aux = Context.getAuxTargetInfo();
   // CUDA functions may have host and/or device attributes which indicate
   // their targeted execution environment, therefore the calling convention
   // of functions in CUDA should be checked against the target deduced based
   // on their host/device attributes.
   if (LangOpts.CUDA) {
-    auto *Aux = Context.getAuxTargetInfo();
     assert(FD || CFT != CUDAFunctionTarget::InvalidTarget);
     auto CudaTarget = FD ? CUDA().IdentifyTarget(FD) : CFT;
     bool CheckHost = false, CheckDevice = false;
@@ -5525,6 +5525,15 @@ bool Sema::CheckCallingConvAttr(const ParsedAttr &Attrs, CallingConv &CC,
       A = HostTI->checkCallingConvention(CC);
     if (A == TargetInfo::CCCR_OK && CheckDevice && DeviceTI)
       A = DeviceTI->checkCallingConvention(CC);
+  } else if (LangOpts.SYCLIsDevice && TI.getTriple().isAMDGPU() &&
+             CC == CC_X86VectorCall) {
+    // Assuming SYCL Device AMDGPU CC_X86VectorCall functions are always to be
+    // emitted on the host. The MSVC STL has CC-based specializations so we
+    // cannot change the CC to be the default as that will cause a clash with
+    // another specialization.
+    A = TI.checkCallingConvention(CC);
+    if (Aux && A != TargetInfo::CCCR_OK)
+      A = Aux->checkCallingConvention(CC);
   } else {
     A = TI.checkCallingConvention(CC);
   }
diff --git a/clang/test/SemaSYCL/Inputs/vectorcall.hpp b/clang/test/SemaSYCL/Inputs/vectorcall.hpp
@@ -0,0 +1,18 @@
+
+template <typename F> struct A{};
+
+template <typename Ret, typename C, typename... Args> struct A<Ret (             C::*)(Args...) noexcept> { static constexpr int value = 0; };
+template <typename Ret, typename C, typename... Args> struct A<Ret (__vectorcall C::*)(Args...) noexcept> { static constexpr int value = 1; };
+
+template <typename F> constexpr int A_v = A<F>::value;
+
+struct B
+{
+    void f() noexcept {}
+    void __vectorcall g() noexcept {}
+};
+
+int main()
+{
+    return A_v<decltype(&B::f)> + A_v<decltype(&B::g)>;
+}
diff --git a/clang/test/SemaSYCL/sycl-cconv-win.cpp b/clang/test/SemaSYCL/sycl-cconv-win.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -isystem %S/Inputs/ -fsycl-is-device -triple amdgcn-amd-hsa -aux-triple x86_64-pc-windows-msvc -fsyntax-only -verify %s
+
+// expected-no-diagnostics
+
+#include <vectorcall.hpp>