Commit eaff1cf

[libclc][cuda] CTS fix: CUDA backend uses "success" atomic order for cas. (#12502)
There was a bug in the cas implementation for nvptx in libclc that led to CTS test failures for the CUDA backend. This fixes the bug simply: in the cases where the failure order differs from the success order (when the failure order is either `release` or `acquire`), the failure order is replaced so that it matches the success order (`acq_rel`).

This is safe even if the cas performs the failure operation, because `acq_rel` can be used for both acquire (load) and release (store) atomic ops in PTX. I think that this is the only valid way to implement cas for nvptx, because the cas operation only takes one order argument.

The SYCL CTS now passes for `acq_rel` atomics on the CUDA backend.

Signed-off-by: JackAKirk <[email protected]>
1 parent 78affac commit eaff1cf

1 file changed: 1 addition, 1 deletion

libclc/ptx-nvidiacl/libspirv/atomic/atomic_cmpxchg.cl

@@ -86,7 +86,7 @@ SemanticsMask4FlagES##SUBSTITUTION2##_##TYPE_MANGLED##TYPE_MANGLED( \
       enum MemorySemanticsMask semantics2, TYPE cmp, TYPE value) { \
     /* Semantics mask may include memory order, storage class and other info \
        Memory order is stored in the lowest 5 bits */ \
-    unsigned int order = (semantics1 | semantics2) & 0x1F; \
+    unsigned int order = semantics1 & 0x1F; \
     switch (order) { \
     case None: \
       __CLC_NVVM_ATOMIC_CAS_IMPL_ORDER(TYPE, TYPE_NV, TYPE_MANGLED_NV, OP, \
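
To illustrate why the one-line change matters, here is a minimal standalone sketch in plain C, not the libclc code itself. It assumes the SPIR-V `MemorySemantics` bit values (Acquire `0x2`, Release `0x4`, AcquireRelease `0x8`, SequentiallyConsistent `0x10`, CrossWorkgroupMemory `0x200`). The names `order_before_fix`, `order_after_fix`, and `is_single_order` are hypothetical helpers introduced only for this example.

```c
/* SPIR-V MemorySemantics memory-order bits (assumed from the SPIR-V spec). */
enum {
  ORDER_NONE = 0x0,
  ORDER_ACQUIRE = 0x2,
  ORDER_RELEASE = 0x4,
  ORDER_ACQ_REL = 0x8,
  ORDER_SEQ_CST = 0x10
};

/* Hypothetical helper: the pre-fix expression, which ORs the success and
   failure masks together before keeping the lowest 5 bits. */
static unsigned int order_before_fix(unsigned int semantics1,
                                     unsigned int semantics2) {
  return (semantics1 | semantics2) & 0x1F;
}

/* Hypothetical helper: the post-fix expression, which keeps only the
   success order and ignores the failure mask. */
static unsigned int order_after_fix(unsigned int semantics1,
                                    unsigned int semantics2) {
  (void)semantics2; /* the failure order no longer contributes */
  return semantics1 & 0x1F;
}

/* Hypothetical helper: returns 1 when the value is exactly one of the
   memory orders listed above, 0 when it is a mix of several bits. */
static int is_single_order(unsigned int order) {
  return order == ORDER_NONE || order == ORDER_ACQUIRE ||
         order == ORDER_RELEASE || order == ORDER_ACQ_REL ||
         order == ORDER_SEQ_CST;
}
```

With a success order of `acq_rel` and a failure order of `acquire`, the old expression yields `0xA`, which is not any single memory order value, while the new expression yields `0x8` (`acq_rel`). Storage class bits above the lowest five, such as `0x200`, are masked off in both versions.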
