Skip to content

[clang][Sema][SYCL] Fix MSVC STL usage on AMDGPU #135979

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 18, 2025
Merged

Conversation

sarnex
Copy link
Member

@sarnex sarnex commented Apr 16, 2025

The MSVC STL includes specializations of _Is_memfunptr for every function pointer type, including every calling convention.

The problem is the AMDGPU target doesn't support the x86 vectorcall calling convention so clang sets it to the default CC. This ends up clashing with the already-existing overload for the default CC, so we get a duplicate definition error when including type_traits (which we heavily use in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include the C++ STL, and it doesn't happen for CUDA/HIP because a similar workaround was done here.

I am not an expert in Sema, so I did a kinda of hardcoded fix, please let me know if there is a better way to fix this.

As far as I can tell we can't do exactly the same fix that was done for CUDA because we can't differentiate between device and host code so easily.

Copy link

github-actions bot commented Apr 16, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@sarnex sarnex force-pushed the amdgpu branch 2 times, most recently from 2dfcad2 to 6ed0ba1 Compare April 16, 2025 18:50
@sarnex sarnex marked this pull request as ready for review April 17, 2025 16:34
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Apr 17, 2025
@sarnex sarnex requested a review from Fznamznon April 17, 2025 16:34
@llvmbot
Copy link
Member

llvmbot commented Apr 17, 2025

@llvm/pr-subscribers-clang

Author: Nick Sarnie (sarnex)

Changes

The MSVC STL includes specializations of _Is_memfunptr for every function pointer type, including every calling convention.

The problem is the AMDGPU target doesn't support the x86 vectorcall calling convention so clang sets it to the default CC. This ends up clashing with the already-existing overload for the default CC, so we get a duplicate definition error when including type_traits (which we heavily use in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include the C++ STL, and it doesn't happen for CUDA/HIP because a similar workaround was done here.

I am not an expert in Sema, so I did a kinda of hardcoded fix, please let me know if there is a better way to fix this.

As far as I can tell we can't do exactly the same fix that was done for CUDA because we can't differentiate between device and host code so easily.


Full diff: https://github.com/llvm/llvm-project/pull/135979.diff

3 Files Affected:

  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+6-1)
  • (added) clang/test/SemaSYCL/Inputs/vectorcall.hpp (+18)
  • (added) clang/test/SemaSYCL/sycl-cconv-win.cpp (+5)
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index bc891fb009410..3594111d86ccc 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -5492,12 +5492,12 @@ bool Sema::CheckCallingConvAttr(const ParsedAttr &Attrs, CallingConv &CC,
 
   TargetInfo::CallingConvCheckResult A = TargetInfo::CCCR_OK;
   const TargetInfo &TI = Context.getTargetInfo();
+  auto *Aux = Context.getAuxTargetInfo();
   // CUDA functions may have host and/or device attributes which indicate
   // their targeted execution environment, therefore the calling convention
   // of functions in CUDA should be checked against the target deduced based
   // on their host/device attributes.
   if (LangOpts.CUDA) {
-    auto *Aux = Context.getAuxTargetInfo();
     assert(FD || CFT != CUDAFunctionTarget::InvalidTarget);
     auto CudaTarget = FD ? CUDA().IdentifyTarget(FD) : CFT;
     bool CheckHost = false, CheckDevice = false;
@@ -5522,6 +5522,11 @@ bool Sema::CheckCallingConvAttr(const ParsedAttr &Attrs, CallingConv &CC,
       A = HostTI->checkCallingConvention(CC);
     if (A == TargetInfo::CCCR_OK && CheckDevice && DeviceTI)
       A = DeviceTI->checkCallingConvention(CC);
+  } else if (LangOpts.SYCLIsDevice && TI.getTriple().isAMDGPU() &&
+             CC == CC_X86VectorCall) {
+    A = TI.checkCallingConvention(CC);
+    if (Aux && A != TargetInfo::CCCR_OK)
+      A = Aux->checkCallingConvention(CC);
   } else {
     A = TI.checkCallingConvention(CC);
   }
diff --git a/clang/test/SemaSYCL/Inputs/vectorcall.hpp b/clang/test/SemaSYCL/Inputs/vectorcall.hpp
new file mode 100644
index 0000000000000..566a48dd24c30
--- /dev/null
+++ b/clang/test/SemaSYCL/Inputs/vectorcall.hpp
@@ -0,0 +1,18 @@
+
+template <typename F> struct A{};
+
+template <typename Ret, typename C, typename... Args> struct A<Ret (             C::*)(Args...) noexcept> { static constexpr int value = 0; };
+template <typename Ret, typename C, typename... Args> struct A<Ret (__vectorcall C::*)(Args...) noexcept> { static constexpr int value = 1; };
+
+template <typename F> constexpr int A_v = A<F>::value;
+
+struct B
+{
+    void f() noexcept {}
+    void __vectorcall g() noexcept {}
+};
+
+int main()
+{
+    return A_v<decltype(&B::f)> + A_v<decltype(&B::g)>;
+}
diff --git a/clang/test/SemaSYCL/sycl-cconv-win.cpp b/clang/test/SemaSYCL/sycl-cconv-win.cpp
new file mode 100644
index 0000000000000..f0c22a2ed3921
--- /dev/null
+++ b/clang/test/SemaSYCL/sycl-cconv-win.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -isystem %S/Inputs/ -fsycl-is-device -triple amdgcn-amd-hsa -aux-triple x86_64-pc-windows-msvc -fsyntax-only -verify %s
+
+// expected-no-diagnostics
+
+#include <vectorcall.hpp>

Signed-off-by: Sarnie, Nick <[email protected]>
@sarnex sarnex merged commit 257b727 into llvm:main Apr 18, 2025
11 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
The MSVC STL includes specializations of `_Is_memfunptr` for every
function pointer type, including every calling convention.

The problem is the AMDGPU target doesn't support the x86 `vectorcall`
calling convention so clang sets it to the default CC. This ends up
clashing with the already-existing overload for the default CC, so we
get a duplicate definition error when including `type_traits` (which we
heavily use in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include
the C++ STL, and it doesn't happen for CUDA/HIP because a similar
workaround was done
[here](llvm@fa49c3a).

I am not an expert in Sema, so I did a kinda of hardcoded fix, please
let me know if there is a better way to fix this.

As far as I can tell we can't do exactly the same fix that was done for
CUDA because we can't differentiate between device and host code so
easily.

---------

Signed-off-by: Sarnie, Nick <[email protected]>
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
The MSVC STL includes specializations of `_Is_memfunptr` for every
function pointer type, including every calling convention.

The problem is the AMDGPU target doesn't support the x86 `vectorcall`
calling convention so clang sets it to the default CC. This ends up
clashing with the already-existing overload for the default CC, so we
get a duplicate definition error when including `type_traits` (which we
heavily use in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include
the C++ STL, and it doesn't happen for CUDA/HIP because a similar
workaround was done
[here](llvm@fa49c3a).

I am not an expert in Sema, so I did a kinda of hardcoded fix, please
let me know if there is a better way to fix this.

As far as I can tell we can't do exactly the same fix that was done for
CUDA because we can't differentiate between device and host code so
easily.

---------

Signed-off-by: Sarnie, Nick <[email protected]>
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
The MSVC STL includes specializations of `_Is_memfunptr` for every
function pointer type, including every calling convention.

The problem is the AMDGPU target doesn't support the x86 `vectorcall`
calling convention so clang sets it to the default CC. This ends up
clashing with the already-existing overload for the default CC, so we
get a duplicate definition error when including `type_traits` (which we
heavily use in the SYCL STL) and compiling for AMDGPU on Windows.

This doesn't happen for pure AMDGPU non-SYCL because it doesn't include
the C++ STL, and it doesn't happen for CUDA/HIP because a similar
workaround was done
[here](llvm@fa49c3a).

I am not an expert in Sema, so I did a kinda of hardcoded fix, please
let me know if there is a better way to fix this.

As far as I can tell we can't do exactly the same fix that was done for
CUDA because we can't differentiate between device and host code so
easily.

---------

Signed-off-by: Sarnie, Nick <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants