[libclc][cuda] CTS fix: CUDA backend uses "success" atomic order for cas. #12502

JackAKirk · 2024-01-25T20:47:37Z

CTS fix: CUDA backend uses "success" atomic order for cas.

There was a bug in the cas implementation for nvptx in libclc that led to CTS test failures for the CUDA backend.
This fixes the bug by replacing the failure order with the success order (acq_rel) in the cases where the two differ, that is, when the failure order is either release or acquire. This is safe even if the cas takes the failure path, because in PTX an acq_rel operation counts as both an acquire (load) and a release (store) atomic operation. I think that this is the only valid way to implement cas for nvptx, because the PTX cas instruction takes only one order argument.
The SYCL CTS now passes for acq_rel atomics on the CUDA backend.
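As a rough illustration of the rule described above, here is a minimal C++ sketch. The helper name `cas_order_for_ptx` is hypothetical and is not the libclc code; it only shows the idea that the single ordering handed to the PTX cas is the success order, with the failure order ignored.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (an assumption for illustration, not the libclc source):
// the PTX cas carries one ordering qualifier, so the success order is used
// and the failure order is deliberately dropped.
constexpr std::memory_order cas_order_for_ptx(std::memory_order success,
                                              std::memory_order /*failure*/) {
    return success;
}
```

With this sketch, a request with success order acq_rel and failure order acquire resolves to acq_rel, which is the case the description above is concerned with.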

This is safe because acq_rel can be used for acquire and release in ptx. This fixes a bug in a simple way and now the sycl cts passes for acq_rel atomics for the cuda backend. Signed-off-by: JackAKirk <[email protected]>

ldrumm

Looks good. Do you have any references to NVIDIA docs that might back up your assertion of safety?

JackAKirk · 2024-01-26T12:13:22Z

> Looks good. Do you have any references to NVIDIA docs that might back up your assertion of safety?

From https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#id70

"acquire operation: A memory operation with .acquire or .acq_rel qualifier."
"release operation: A memory operation with .release or .acq_rel qualifier."
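The two quoted definitions can be written down as predicates. This is only a sketch of the two sentences quoted above; the enum `PtxSem` and the function names are made up for illustration and do not come from the PTX documentation or libclc.

```cpp
#include <cassert>

// Hypothetical model of the PTX .sem qualifiers discussed in this thread.
enum class PtxSem { relaxed, acquire, release, acq_rel };

// "acquire operation: A memory operation with .acquire or .acq_rel qualifier."
constexpr bool is_acquire_operation(PtxSem s) {
    return s == PtxSem::acquire || s == PtxSem::acq_rel;
}

// "release operation: A memory operation with .release or .acq_rel qualifier."
constexpr bool is_release_operation(PtxSem s) {
    return s == PtxSem::release || s == PtxSem::acq_rel;
}
```

Under these definitions `.acq_rel` satisfies both predicates, which is the basis for using it where either acquire or release was requested.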

For the case where the failure order is stronger than the success order, the justification for still using the success order is as follows.

From the SYCL specification, on the definition of memory consistency model behavior:

"The SYCL memory consistency model is based upon the memory consistency model of the C++ core language. Where SYCL offers extensions to classes and functions that may affect memory consistency, the default behavior when these extensions are not used always matches the behavior of standard C++."

and then a page later:

"
sycl::memory_order::acquire;
sycl::memory_order::release;
sycl::memory_order::acq_rel;
sycl::memory_order::seq_cst.
The meanings of these values are identical to those defined in the C++ core language.
"

Then, looking at the C++ definition:
https://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange

"If failure is stronger than success or (until C++17) is one of std::memory_order_release and std::memory_order_acq_rel, the behavior is undefined."
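For reference, a call that stays within the well-defined part of the C++ interface looks like the sketch below: success order acq_rel with failure order acquire, which is not stronger than success and is neither release nor acq_rel. The function name `try_claim` is hypothetical, and standard `std::atomic` is used here rather than `sycl::atomic_ref` so that the example runs on the host.

```cpp
#include <atomic>
#include <cassert>

// Attempts to replace `expected` with `desired` in `slot`.
// Returns true if the exchange happened. `expected` is taken by value, so the
// value written back on failure is discarded by this helper.
inline bool try_claim(std::atomic<int>& slot, int expected, int desired) {
    return slot.compare_exchange_strong(expected, desired,
                                        std::memory_order_acq_rel,   // success
                                        std::memory_order_acquire);  // failure
}
```

The first call on a slot holding the expected value succeeds and stores the new value; a second call with the same stale expected value fails and leaves the slot unchanged.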

Therefore, I think the mapping described in the commit message is the most sensible one that still satisfies the SYCL function definition. It goes from the C++/SYCL interface (assuming the default scope from the atomic_ref constructor, to satisfy the clause "the default behavior when these extensions are not used always matches the behavior of standard C++"):

compare_exchange_strong( T& expected, T desired, std::memory_order success, std::memory_order failure )

to the PTX instruction:

atom{.sem}{.scope}{.space}.cas

with the success order supplying the single .sem qualifier.
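To make the shape of that mapping concrete, the sketch below assembles the PTX mnemonic text from the success order only. Everything here is an assumption for illustration: the enum `CasSem`, the helpers `sem_suffix` and `cas_mnemonic`, and the choice to pass scope, state space, and type as ready-made suffix strings. It is not how libclc emits the instruction, and seq_cst is left out because it does not appear among the `.sem` qualifiers discussed in this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the .sem qualifiers available to the PTX cas.
enum class CasSem { relaxed, acquire, release, acq_rel };

// Maps the success order to its PTX qualifier text.
inline std::string sem_suffix(CasSem s) {
    switch (s) {
        case CasSem::relaxed: return ".relaxed";
        case CasSem::acquire: return ".acquire";
        case CasSem::release: return ".release";
        case CasSem::acq_rel: return ".acq_rel";
    }
    return "";  // unreachable for valid enum values
}

// Builds "atom{.sem}{.scope}{.space}.cas{.type}" using only the success order;
// there is no slot in the instruction for a separate failure order.
inline std::string cas_mnemonic(CasSem success, const std::string& scope,
                                const std::string& space,
                                const std::string& type) {
    return "atom" + sem_suffix(success) + scope + space + ".cas" + type;
}
```

For example, `cas_mnemonic(CasSem::acq_rel, ".gpu", ".global", ".b32")` yields `atom.acq_rel.gpu.global.cas.b32`, regardless of what failure order the caller originally requested.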

Cuda backend uses success order in all cases for cas.

95bfa37

JackAKirk requested a review from a team as a code owner January 25, 2024 20:47

JackAKirk requested a review from steffenlarsen January 25, 2024 20:47

JackAKirk temporarily deployed to WindowsCILock January 25, 2024 21:48 — with GitHub Actions Inactive

JackAKirk temporarily deployed to WindowsCILock January 25, 2024 22:18 — with GitHub Actions Inactive

ldrumm approved these changes Jan 26, 2024

steffenlarsen merged commit eaff1cf into intel:sycl Jan 26, 2024
