Commit 09b4fe3

[SYCL][CUDA] Add missing barrier to collectives
SYCL sub-group and group functions should act as synchronization points. Group collectives need a barrier at the end to ensure that back-to-back collectives do not lead to a race condition.

Note that the barrier at the beginning of each collective occurs after each work-item writes its partial results to the scratch space. This is assumed safe because only the collective functions can access the space, and collective functions must be encountered in uniform control flow; any work-item encountering a collective function can assume it is safe to use the scratch space, because all work-items in the same work-group must have either executed no collective functions or the barrier at the end of a previous collective function.

Signed-off-by: John Pennycook <[email protected]>
1 parent ccd19d5 commit 09b4fe3

1 file changed, +1 -0 lines changed

libclc/ptx-nvidiacl/libspirv/group/collectives.cl

Lines changed: 1 addition & 0 deletions
@@ -264,6 +264,7 @@ __CLC_SUBGROUP_COLLECTIVE(FMax, __CLC_MAX, double, -DBL_MAX)
       result = OP(sg_x, scratch[sg_id - 1]); \
     } \
   } \
+  __spirv_ControlBarrier(Workgroup, 0, 0); \
   return result; \
 }
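
The reasoning in the commit message can be illustrated with a small host-side analogy. The sketch below is not the libclc code: it assumes POSIX threads standing in for work-items, a `pthread_barrier_t` standing in for `__spirv_ControlBarrier(Workgroup, 0, 0)`, and a plain sum in place of the sub-group scan. All names (`NUM_ITEMS`, `group_reduce_add`, `work_item`, `run_back_to_back_reductions`) are hypothetical. Each work-item publishes its partial result, waits, reads the shared scratch space, and then waits again before returning, so that a second collective cannot overwrite the scratch space while a slower work-item is still reading it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

#define NUM_ITEMS 8 /* hypothetical work-group size */

static int scratch[NUM_ITEMS];          /* shared scratch space */
static pthread_barrier_t group_barrier; /* stand-in for the work-group barrier */

/* Toy group reduction: every work-item contributes x and gets the total. */
static int group_reduce_add(int id, int x) {
  scratch[id] = x;                      /* publish partial result */
  pthread_barrier_wait(&group_barrier); /* leading barrier: all writes visible */

  int result = 0;
  for (int i = 0; i < NUM_ITEMS; ++i) {
    result += scratch[i];
  }

  /* Trailing barrier (the analogue of the line added by this commit):
     nobody returns, and so nobody can start the next collective and
     overwrite scratch, until every work-item has finished reading. */
  pthread_barrier_wait(&group_barrier);
  return result;
}

struct item {
  int id;
  int first;
  int second;
};

/* Each work-item runs two collectives back to back, in uniform control flow. */
static void *work_item(void *arg) {
  struct item *it = arg;
  it->first = group_reduce_add(it->id, it->id + 1);         /* 1 + ... + 8   */
  it->second = group_reduce_add(it->id, 10 * (it->id + 1)); /* 10 + ... + 80 */
  return NULL;
}

/* Returns 1 if every work-item observed both expected sums, otherwise 0. */
int run_back_to_back_reductions(void) {
  pthread_t threads[NUM_ITEMS];
  struct item items[NUM_ITEMS];

  int rc = pthread_barrier_init(&group_barrier, NULL, NUM_ITEMS);
  assert(rc == 0);

  for (int i = 0; i < NUM_ITEMS; ++i) {
    items[i].id = i;
    items[i].first = 0;
    items[i].second = 0;
    rc = pthread_create(&threads[i], NULL, work_item, &items[i]);
    assert(rc == 0); /* a failed create would leave the rest stuck at the barrier */
  }

  int ok = 1;
  for (int i = 0; i < NUM_ITEMS; ++i) {
    pthread_join(threads[i], NULL);
    if (items[i].first != 36 || items[i].second != 360) {
      ok = 0;
    }
  }

  pthread_barrier_destroy(&group_barrier);
  (void)rc;
  return ok;
}
```

With both barriers in place, `run_back_to_back_reductions()` returns 1. Removing the trailing `pthread_barrier_wait` reintroduces the race described in the commit message, although whether a given run actually observes a wrong sum depends on thread timing.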
