[SYCL][CUDA][libclc] Added atomics with scopes and memory orders #4820

t4c1 · 2021-10-26T12:43:55Z

Added libclc implementations for CUDA atomics, including for various scopes and memory orders. They are implemented using LLVM intrinsics and exposed as clang builtins, which are then used to implement functions in libclc.

Sorry for the huge pull request, but I think for these changes it makes sense that they are reviewed together.

EDIT: llvm-test-suite PR with tests for this is intel/llvm-test-suite#534
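For readers less familiar with the terminology, the following is a minimal host-side C++ sketch of what a memory order on an atomic compare-and-swap means. It uses only the standard `std::atomic` API, which has no notion of scope, and it is not the libclc code from this PR; the helper names `cas_acq_rel` and `cas_example` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the PR): compare-and-swap with
// acquire-release ordering on success and acquire ordering on failure.
// Returns the value observed in `obj` before the operation.
inline std::int64_t cas_acq_rel(std::atomic<std::int64_t>& obj,
                                std::int64_t expected,
                                std::int64_t desired) {
    // On failure, compare_exchange_strong writes the observed value into
    // `expected`; on success, `expected` already equals the old value.
    obj.compare_exchange_strong(expected, desired,
                                std::memory_order_acq_rel,
                                std::memory_order_acquire);
    return expected;
}

// Hypothetical usage: the first swap succeeds, the second does not.
inline void cas_example() {
    std::atomic<std::int64_t> value{5};
    assert(cas_acq_rel(value, 5, 9) == 5);  // matched, value becomes 9
    assert(cas_acq_rel(value, 5, 1) == 9);  // mismatch, value stays 9
    assert(value.load() == 9);
}
```

The device-side builtins in this PR additionally carry a scope (for example `cta` or `sys`), which standard C++ atomics cannot express.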

bader

> Sorry for the huge pull request, but I think for these changes it makes sense that they are reviewed together.

Okay. To make the review process easier, it is better to split changes into multiple commits. For instance, we are adding support for 10 atomic operations, so one way to split the patch is to create a separate commit for each operation. It should be much easier to understand what is going on for a single operation, so once the first commit in the pull request is reviewed, the others should be easier to review. Please consider this approach for future pull requests.

For this one, I suggest uploading clang and llvm project changes (i.e. non-libclc) to reviews.llvm.org to get feedback from NVPTX target maintainers.

llvm/test/CodeGen/NVPTX/atomics-with-semantics.ll

libclc/ptx-nvidiacl/libspirv/atomic/atomic_helpers.h

libclc/ptx-nvidiacl/libspirv/atomic/atomic_inc_dec_helpers.h

libclc/ptx-nvidiacl/libspirv/atomic/loadstore_helpers.ll

libclc/ptx-nvidiacl/libspirv/atomic/atomic_cmpxchg.cl

t4c1 · 2021-10-26T13:27:53Z

> Okay. To make the review process easier, it is better to split changes into multiple commits. For instance, we are adding support for 10 atomic operations, so one way to split the patch is to create a separate commit for each operation.

Well I originally had multiple commits, but they were a bit of a mess, since I am learning a lot of this at the same time. I don't really think splitting this by operation makes sense, as a lot of code is common for all operations.

llvm/test/CodeGen/NVPTX/atomics-with-semantics.ll

bader · 2021-10-27T11:44:05Z

/summary:run

Naghasan

Looks way better. Thanks.

clang/lib/CodeGen/CGBuiltin.cpp

steffenlarsen

A few comments, but I think it's good overall. That said, I agree with @bader that everything in clang/* and llvm/* should be reviewed and merged upstream (see https://llvm.org/docs/Contributing.html).

libclc/ptx-nvidiacl/libspirv/atomic/atomic_helpers.h

clang/include/clang/Basic/BuiltinsNVPTX.def

steffenlarsen · 2021-11-05T12:12:32Z

clang/include/clang/Basic/BuiltinsNVPTX.def

+TARGET_BUILTIN(__nvvm_atom_cta_acq_rel_cas_gen_ll, "LLiLLiD*LLiLLi", "n", SM_70)
+TARGET_BUILTIN(__nvvm_atom_sys_acq_rel_cas_gen_ll, "LLiLLiD*LLiLLi", "n", SM_70)
+
+BUILTIN(__nvvm_atom_add_global_i, "iiD*i", "n")


I cannot help but wonder if the builtins could select the right intrinsics without the space explicitly stated in the builtin's name, i.e. inferring it from the pointer's address space. Or maybe even as late as instruction selection?
@Naghasan - Do you have insight into whether or not that is possible?
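To make the question concrete, here is a small C++ sketch of the idea of inferring the address space from the pointer type rather than spelling it in the name. Every name in it (`AddrSpace`, `Ptr`, `space_suffix`, `intrinsic_name`, `naming_example`) is hypothetical, and it only builds a name string; it does not reflect how clang or the NVPTX backend actually select intrinsics.

```cpp
#include <cassert>
#include <string>

// Hypothetical address spaces; the suffixes mirror the "gen", "global"
// and "shared" fragments seen in the names quoted in this thread.
enum class AddrSpace { Generic, Global, Shared };

// Hypothetical pointer wrapper that carries its address space in the type.
template <typename T, AddrSpace AS>
struct Ptr {
    T* raw;
};

inline const char* space_suffix(AddrSpace as) {
    switch (as) {
    case AddrSpace::Global: return "global";
    case AddrSpace::Shared: return "shared";
    default:                return "gen";
    }
}

// The caller names only the operation; the space comes from the pointer type.
template <typename T, AddrSpace AS>
std::string intrinsic_name(Ptr<T, AS>, const std::string& op) {
    return "atom_" + op + "_" + space_suffix(AS);
}

inline void naming_example() {
    Ptr<int, AddrSpace::Shared> p{nullptr};
    assert(intrinsic_name(p, "add") == "atom_add_shared");
}
```

With this shape the builtin name would carry only the operation, scope, and order, and the space-specific variant would be picked from the argument type.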

t4c1 · 2021-11-09T12:19:20Z

> That said, I agree with @bader that everything in clang/* and llvm/* should be reviewed and merged upstream (see https://llvm.org/docs/Contributing.html).

I requested a review, but forgot to post a link to it. It is here: https://reviews.llvm.org/D112718

elizabethandrews

I'm not familiar with this, so I cannot comment on the functionality. FE changes LGTM otherwise. Please wait for review from someone more familiar with this before merge. If this is merged to llvm-project instead, please apply review comments to patch uploaded there.

smanna12

FE Changes look good to me.

bader · 2021-11-16T16:36:16Z

@t4c1, it looks like the build is broken. Could you take a look at the failures here: http://ci.llvm.intel.com:8010/#builders/37/builds/14729, please?

t4c1 · 2021-11-17T08:20:47Z

Thank you for the reminder. I forgot to make changes in libclc after modifying the builtins. Should be fixed now.

bader · 2021-11-17T08:23:08Z

/summary:run

…4853) Updates the return values of the atomic memory order and scope capability queries to bring them in line with the changes in #4820. This includes adding the previously missing option to query atomic scope capabilities.

) Added tests for atomics with various memory orders and scopes. Reductions tests also have updated sm requirements, as they call work group atomics, which are now implemented and have higher sm requirements than device scoped ones. This adds tests for changes introduced in intel/llvm#4820 and intel/llvm#5192.
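The relationship between scope, memory order, and minimum compute capability described here can be sketched as a small C++ lookup. The thresholds are assumptions: 70 for non-relaxed orders follows the `SM_70` requirement on the `acq_rel` builtins quoted from `BuiltinsNVPTX.def` in this thread, while 60 for an explicit work-group or system scope is taken from the PTX ISA documentation and should be checked against it. The names `Scope`, `Order`, `min_sm`, and `min_sm_example` are hypothetical.

```cpp
#include <cassert>

// Hypothetical enums for illustration only.
enum class Scope { WorkGroup, Device, System };
enum class Order { Relaxed, Acquire, Release, AcqRel };

// Returns the assumed minimum SM version for an atomic with the given
// scope and order, or 0 when this sketch assumes no extra requirement.
inline int min_sm(Scope scope, Order order) {
    if (order != Order::Relaxed) return 70;  // assumption: ordered atomics
    if (scope != Scope::Device) return 60;   // assumption: explicit scope
    return 0;                                // relaxed, device scope
}

inline void min_sm_example() {
    // A relaxed work-group atomic needs more than a relaxed device one,
    // which is why the reduction tests gained a higher requirement.
    assert(min_sm(Scope::WorkGroup, Order::Relaxed) >
           min_sm(Scope::Device, Order::Relaxed));
}
```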

[SYCL][CUDA][libclc] Added atomics with various scopes and memory ord…

03e0068

…ers in libclc for CUDA.

t4c1 requested review from bader, elizabethandrews, premanandrao and smanna12 as code owners October 26, 2021 12:43

t4c1 changed the title ~~[SYCL][CUDA][libclc] Added atomics with scopes and memory ord…~~ [SYCL][CUDA][libclc] Added atomics with scopes and memory orders Oct 26, 2021

t4c1 added 2 commits October 26, 2021 14:47

Merge branch 'sycl' into ptx_atomics2

ad3e28e

# Conflicts: # libclc/ptx-nvidiacl/libspirv/SPV_EXT_shader_atomic_float_add/atomicfaddext.cl

[SYCL][CUDA][libclc] format

7d29406

bader previously approved these changes Oct 26, 2021

t4c1 mentioned this pull request Oct 26, 2021

[SYCL] Added tests for atomics with various memory orders and scopes intel/llvm-test-suite#534

Merged

bader dismissed their stale review via 7d29406 October 26, 2021 19:16

t4c1 and others added 2 commits October 27, 2021 08:42

Revert "[SYCL][CUDA][libclc] format" for loadstore_helpers.ll

29e4457

This reverts part of commit 7d29406.

Apply suggestions from code review

13fa2a9

Co-authored-by: Alexey Bader <[email protected]>

bader previously approved these changes Oct 27, 2021

Naghasan requested changes Oct 27, 2021

llvm/test/CodeGen/NVPTX/atomics-with-semantics.ll (outdated)

[SYCL][CUDA][libclc] fixed llvm/test/CodeGen/NVPTX/atomics-with-seman…

4ca997b

…tics.ll as suggested by review comment

t4c1 dismissed bader’s stale review via 4ca997b October 28, 2021 12:14

Naghasan previously approved these changes Oct 28, 2021

t4c1 mentioned this pull request Oct 29, 2021

[SYCL][PI][CUDA] Update queries for atomic order and scope for CUDA #4853

Merged

steffenlarsen self-requested a review November 1, 2021 11:47

elizabethandrews reviewed Nov 1, 2021

clang/lib/CodeGen/CGBuiltin.cpp (outdated)

steffenlarsen reviewed Nov 5, 2021

dm-vodopyanov added the cuda (CUDA back-end) and libclc (libclc project related issues) labels Nov 8, 2021

t4c1 added 2 commits November 9, 2021 11:12

refactor and cleanup

a3d1782

reverset the order of semantics and scope

87206dc

t4c1 dismissed Naghasan’s stale review via 87206dc November 9, 2021 11:24

format

f89b6ce

elizabethandrews previously approved these changes Nov 10, 2021

smanna12 previously approved these changes Nov 12, 2021

[SYCL][CUDA][libclc] fixed calls from libclc after change in builtins…

1c3bc41

… names

t4c1 dismissed stale reviews from smanna12 and elizabethandrews via 1c3bc41 November 17, 2021 08:19

bader approved these changes Nov 17, 2021

smanna12 approved these changes Nov 17, 2021

bader merged commit 2ebde5f into intel:sycl Nov 18, 2021

ghost mentioned this pull request Nov 22, 2021

[SYCL][CUDA] Fatal error: error in backend: Cannot select: intrinsic %llvm.nvvm.atomic.add.shared.i.cta #5008

Closed

vladimirlaz mentioned this pull request Nov 24, 2021

[SYCL] fix interop queue to not take ownership intel/llvm-test-suite#577

Merged

t4c1 deleted the ptx_atomics2 branch March 15, 2022 08:51

ldrumm mentioned this pull request Jan 9, 2023

Implement AtomicFAddEXT for the CUDA BE #2853

Closed
