Skip to content

[SYCL][CUDA][libclc] Added atomics with scopes and memory orders #4820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Nov 18, 2021

Conversation

t4c1
Copy link
Contributor

@t4c1 t4c1 commented Oct 26, 2021

Added libclc implementations for CUDA atomics, including for various scopes and memory orders. They are implemented using LLVM intrinsics and exposed as clang builtins, which are than used to implement functions in libclc.

Sorry for a huge pull request, but I think for these changes it make sense they are reviewed together.

EDIT: llvm-test-suite PR with tests for this is intel/llvm-test-suite#534

@t4c1 t4c1 changed the title [SYCL][CUDA][libclc] Added atomics with scopes and memory ord… [SYCL][CUDA][libclc] Added atomics with scopes and memory orders Oct 26, 2021
t4c1 added 2 commits October 26, 2021 14:47
# Conflicts:
#	libclc/ptx-nvidiacl/libspirv/SPV_EXT_shader_atomic_float_add/atomicfaddext.cl
bader
bader previously approved these changes Oct 26, 2021
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for a huge pull request, but I think for these changes it make sense they are reviewed together.

Okay to make reviewing process easier it better to split changes into multiple commits. For instance, we are adding support for 10 atomic operations, so one way to split the patch is create a separate commit for each operation. It should be much easier to understand what's going on for one operation, so that once first commit in pull request is reviewed others should be easier to review. Please, consider such approach for future pull requests.

For this one, I suggest uploading clang and llvm project changes (i.e. non-libclc) to reviews.llvm.org to get feedback from NVPTX target maintainers.

@t4c1
Copy link
Contributor Author

t4c1 commented Oct 26, 2021

Okay to make reviewing process easier it better to split changes into multiple commits. For instance, we are adding support for 10 atomic operations, so one way to split the patch is create a separate commit for each operation.

Well I originally had multiple commits, but they were a bit of a mess, since I am learning a lot of this at the same time. I don't really think splitting this by operation makes sense, as a lot of code is common for all operations.

bader
bader previously approved these changes Oct 27, 2021
@bader
Copy link
Contributor

bader commented Oct 27, 2021

/summary:run

Naghasan
Naghasan previously approved these changes Oct 28, 2021
Copy link
Contributor

@Naghasan Naghasan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks way better. Thanks.

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but I think it's good overall. That said, I agree with @bader that everything in clang/* and llvm/* should be reviewed and merged upstream (see https://llvm.org/docs/Contributing.html).

TARGET_BUILTIN(__nvvm_atom_cta_acq_rel_cas_gen_ll, "LLiLLiD*LLiLLi", "n", SM_70)
TARGET_BUILTIN(__nvvm_atom_sys_acq_rel_cas_gen_ll, "LLiLLiD*LLiLLi", "n", SM_70)

BUILTIN(__nvvm_atom_add_global_i, "iiD*i", "n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot help but wonder if the builtins could select the right intrinsics without the space explicitly stated in the builtin's name, i.e. inferring it from the pointer's address space. Or maybe even as late as instruction selection?
@Naghasan - Do you have insight into whether or not that is possible?

@dm-vodopyanov dm-vodopyanov added cuda CUDA back-end libclc libclc project related issues labels Nov 8, 2021
@t4c1
Copy link
Contributor Author

t4c1 commented Nov 9, 2021

That said, I agree with @bader that everything in clang/* and llvm/* should be reviewed and merged upstream (see https://llvm.org/docs/Contributing.html).

I requested a review, but forgot to post a link to it. It is here: https://reviews.llvm.org/D112718

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this, so I cannot comment on the functionality. FE changes LGTM otherwise. Please wait for review from someone more familiar with this before merge. If this is merged to llvm-project instead, please apply review comments to patch uploaded there.

smanna12
smanna12 previously approved these changes Nov 12, 2021
Copy link
Contributor

@smanna12 smanna12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE Changes look good to me.

@bader
Copy link
Contributor

bader commented Nov 16, 2021

@t4c1, it looks like the build is broken. Could you take a look at the failures here: http://ci.llvm.intel.com:8010/#builders/37/builds/14729, please?

@t4c1 t4c1 dismissed stale reviews from smanna12 and elizabethandrews via 1c3bc41 November 17, 2021 08:19
@t4c1
Copy link
Contributor Author

t4c1 commented Nov 17, 2021

Thank you for the reminder. I forgot to make changes in libclc after modifying the builtins. Should be fixed now.

@bader
Copy link
Contributor

bader commented Nov 17, 2021

/summary:run

@bader bader merged commit 2ebde5f into intel:sycl Nov 18, 2021
bader pushed a commit that referenced this pull request Jan 26, 2022
…4853)

Updates returns for atomics memory order and scope capabilities queries to make them in line with changes in #4820. 

This includes adding the previously not existing option to query for atomic scope capabilities.
bader pushed a commit to intel/llvm-test-suite that referenced this pull request Feb 24, 2022
)

Added tests for atomics with various memory orders and scopes. Reductions tests also have updated sm requirements, as they call work group atomics, which are now implemented and have higher sm requirements than device scoped ones.

This adds tests for changes introduced in intel/llvm#4820 and intel/llvm#5192.
@t4c1 t4c1 deleted the ptx_atomics2 branch March 15, 2022 08:51
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
…ntel/llvm-test-suite#534)

Added tests for atomics with various memory orders and scopes. Reductions tests also have updated sm requirements, as they call work group atomics, which are now implemented and have higher sm requirements than device scoped ones.

This adds tests for changes introduced in intel#4820 and intel#5192.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuda CUDA back-end libclc libclc project related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants