-
Notifications
You must be signed in to change notification settings - Fork 787
[clang][Sema][SYCL] Fix SYCL AMDGPU compilation on Windows #17983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Sarnie, Nick <[email protected]>
Why not? https://godbolt.org/z/Y1q97Tj79 |
Ah oh thanks, I thought it wasn't reproducible. @Fznamznon @intel/dpcpp-cfe-reviewers Do you guys mind reviewing this before I propose upstream? I don't usually work in this area so I don't really know if the fix is the best. |
@@ -5574,6 +5574,12 @@ bool Sema::CheckCallingConvAttr(const ParsedAttr &Attrs, CallingConv &CC, | |||
A = HostTI->checkCallingConvention(CC); | |||
if (A == TargetInfo::CCCR_OK && CheckDevice && DeviceTI) | |||
A = DeviceTI->checkCallingConvention(CC); | |||
} else if (LangOpts.SYCLIsDevice && (Aux && Aux->getTriple().isOSWindows()) && | |||
TI.getTriple().isAMDGPU() && | |||
getSourceManager().isInSystemHeader(Attrs.getScopeLoc())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really know what will happen if the function with a wrong calling convention will reach codegen for the device. But in general, I would not limit this fix to windows and system headers because in SYCL the users are free to do everything in the host code. To prevent errors from device code emission, we usually use DiagIfDeviceCode functions, but I'm not really a fan of them due to their bugginess .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let me remove that part and propose it upstream, thanks
The MSVC STL includes specializations of
_Is_memfunptr
for every function pointer type, including every calling convention.The problem is the AMDGPU target doesn't support the
vectorcall
calling convention so clang sets it to the default CC. This ends up clashing with the already-existing overload for the default CC, so we get a duplicate definiton error when includingtype_traits
(which we heavily use in the SYCL STL) and compiling for AMDGPU on Windows.This doesn't happen for pure AMDGPU non-SYCL because it doesn't include the C++ STL, and it doesn't happen for CUDA/HIP because a similar workaround was done upstream here.
I am submitting this here as opposed to upstream as upstream SYCL doesn't hit the issue due to lacking AMDGPU SYCL support.
I am not an expert in Sema, so I tried to make the fix extremely targeted. If there is a better fix let me know. As far as I can tell we can't do exactly the same fix that was done for CUDA because we can't differentiate between device and host code so easily.
Closes: #7738