-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target #13249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is it safe to have explicit mangling for libclc/generic with AS ptr params? As far as I am aware, the mangling of generic AS pointers can depend on the target that it is being compiled for. This patch does not add builtins for the generic AS, so maybe this is not a concern. My main question is: do we have guarantees that the mangling of global and local AS ptrs will always be the same, regardless of the target being compiled for? I think the answer is yes, but would like to confirm. Tagging @frasercrmck for context |
Thanks @hdelan, you raise a good point. My understanding is that in general mangling is target dependent, and in particular when it comes to address spaces, the mangling depends in the Address Space Map used by the compiler front end, and this applies to all address spaces, not just the generic one. So yeah here I think we are assuming that the generic AS is mapped to 0 when mangling, which is probably not too different from other assumptions that are already made in libclc ( |
Yeah, it's a bit tricky. All manglings are ultimately up to the target, and may depend on other options (see AMDGPU changing the address space mapping depending on whether We also have the problem in I don't have a solid recommendation, but the generic address space can be mangled in three different ways on AMDGPU ( My gut feeling is that we shouldn't have any hard-coded address space manglings in the |
|
||
IMPL_AS(int, i, , 4) | ||
IMPL_AS(unsigned int, j, u, 4) | ||
|
||
IMPL(unsigned int, j, global, , u, 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this really be global
? I thought the PR was for generic
AS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting it, I copy-pasted the line and didn't notice it. I changed to a blank in the call to the macro since OpenCL 3.0 doesn't allot using generic
as an AS qualifier, I've also added implementations of the clc*
load helpers that use addrspace(0)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with nits. As discussed I'm not a fan of manual mangling (especially in the generic builtins) but we are already in that situation, so this PR isn't making anything worse. We'll investigate ways of improving this, I think.
@@ -18,7 +18,7 @@ TYPE __clc__atomic_##PREFIX##load_##AS##_##BYTE_SIZE##_##MEM_ORDER(volatile AS c | |||
FDECL(TYPE, PREFIX, AS, BYTE_SIZE, acquire) \ | |||
FDECL(TYPE, PREFIX, AS, BYTE_SIZE, seq_cst) \ | |||
_CLC_DEF TYPE \ | |||
_Z18__spirv_AtomicLoadPU3##AS_MANGLED##K##TYPE_MANGLED##N5__spv5Scope4FlagENS1_19MemorySemanticsMask4FlagE( \ | |||
_Z18__spirv_AtomicLoadP##AS_MANGLED##K##TYPE_MANGLED##N5__spv5Scope4FlagENS1_19MemorySemanticsMask4FlagE( \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Formatting looks like it could be improved - that \
should be aligned with the others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've aligned the \
at the end of the lines, thank you
@@ -31,8 +31,9 @@ TYPE __clc__atomic_##PREFIX##load_##AS##_##BYTE_SIZE##_##MEM_ORDER(volatile AS c | |||
} | |||
|
|||
#define IMPL_AS(TYPE, TYPE_MANGLED, PREFIX, BYTE_SIZE) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you able to format this whole expression while you're here? It'd be nice to slowly improve the formatting of libclc as we go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 👍
@intel/llvm-gatekeepers this looks ready to be merged, thank you |
…13428) Implements `__spirv_AtomicStore` similarly to #13249. Note that the `IMPL` macro has been extended to take in a `SUB` parameter, similarly to what happens for [amdgcn](https://github.com/intel/llvm/blob/a5a0e1296269195de90949537597b2788bb5e836/libclc/amdgcn-amdhsa/libspirv/atomic/atomic_store.cl#L13) and [ptx](https://github.com/intel/llvm/blob/a5a0e1296269195de90949537597b2788bb5e836/libclc/ptx-nvidiacl/libspirv/atomic/atomic_store.cl#L39).
This PR implements the overload for the generic address space for
__spirv_AtomicLoad
in thegeneric
target.Libclc implements overloads for the generic address space for
__spirv_AtomicLoad
(and several other builtins) in the ptx and amdgcn targets, but doesn't do so forgeneric
.I've created this PR to gather some initial feedback on the implementation, I'd like to add implementations for other builtins with follow up PRs.