Commit d981d37

[SYCL][Fusion][Doc] Document reductions support
Document reduction strategies supported by kernel fusion and how users should use reductions in their code.

Signed-off-by: Victor Perez <[email protected]>
1 parent 2c85e99 commit d981d37

sycl/doc/design/KernelFusionJIT.md

Lines changed: 33 additions & 1 deletion
@@ -313,12 +313,44 @@ computing the private memory size. As range rounding only applies to basic
 kernels (parametrized by a `sycl::range`), local internalization is not affected
 by the range rounding transformation.
 
+### Reductions
+
+Kernel fusion of reductions is partially supported. In order to preserve the
+legality of the fused kernel, i.e., the fact that the fused kernel must perform
+the same work as the graph of kernels to be fused, only the following reduction
+strategies are supported at the time of writing:
+
+- `group_reduce_and_last_wg_detection`
+- `local_atomic_and_atomic_cross_wg`
+- `range_basic`
+- `group_reduce_and_atomic_cross_wg`
+- `local_mem_tree_and_atomic_cross_wg`
+
+Other strategies require implicit inter-work-group synchronization, which is
+not supported in kernel fusion.
+
+Therefore, users should not use `sycl::reduction` directly when performing
+kernel fusion in their code, as an unsupported algorithm might be chosen. They
+should instead use `sycl::detail::reduction_parallel_for`, forcing a supported
+fusion strategy. The reduction implementation in
+[`sycl/reduction.hpp`](../../include/sycl/reduction.hpp) might give users an
+insight into which kind of reductions to use for their purposes:
+
+```c++
+q.submit([&](sycl::handler &cgh) {
+  sycl::accessor in(dataBuf, cgh, sycl::read_only);
+  // sycl::reduction is a function, so its result is bound with 'auto'
+  auto sum = sycl::reduction(sumBuf, cgh, sycl::plus<>{});
+  // Force supported 'group_reduce_and_last_wg_detection' strategy
+  sycl::detail::reduction_parallel_for<sycl::detail::auto_name,
+                                       sycl::detail::strategy::group_reduce_and_last_wg_detection>(...);
+});
+```
+
 ### Unsupported SYCL constructs
 
 The following SYCL API constructs are currently not officially supported for
 kernel fusion and should be considered untested/unsupported:
 
-- Reductions
 - `sycl::stream`
 - Specialization constants and `sycl::kernel_handler`
 - Images (`sycl::unsampled_image` and `sycl::sampled_image`)

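To illustrate why strategies ending in `atomic_cross_wg` need no inter-work-group synchronization, here is a minimal host-side analogy in plain C++. It is not SYCL code and not the DPC++ implementation: the helper `atomic_cross_group_sum` is hypothetical, and the "work-groups" are visited in a sequential loop, whereas on a device they would run concurrently. The sketch only shows the two-step shape: each group reduces its own chunk, then publishes the partial result with a single atomic update.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of SYCL or DPC++): sums `data` by splitting it
// into `num_groups` chunks that stand in for work-groups.
long long atomic_cross_group_sum(const std::vector<int> &data,
                                 std::size_t num_groups) {
  assert(num_groups > 0);
  std::atomic<long long> result{0};
  // Ceiling division so that the chunks cover the whole input.
  const std::size_t chunk = (data.size() + num_groups - 1) / num_groups;
  for (std::size_t g = 0; g < num_groups; ++g) {
    const std::size_t begin = std::min(g * chunk, data.size());
    const std::size_t end = std::min(begin + chunk, data.size());
    // Step 1: reduction that only touches this group's own elements.
    const long long partial =
        std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
    // Step 2: a single atomic update combines partial results across groups.
    // No group has to wait for, or observe, any other group.
    result.fetch_add(partial, std::memory_order_relaxed);
  }
  return result.load();
}
```

Because the only cross-group interaction is the atomic update, the final value does not depend on the order in which groups finish, which is the property the supported strategies listed in the diff rely on.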