@@ -313,12 +313,44 @@ computing the private memory size. As range rounding only applies to basic
kernels (parametrized by a `sycl::range`), local internalization is not affected
by the range rounding transformation.

+ ### Reductions
+
+ Kernel fusion of reductions is partially supported. To preserve the legality of
+ the fused kernel, i.e., the requirement that the fused kernel perform the same
+ work as the graph of kernels being fused, only the following reduction
+ strategies are supported for fusion at the time of writing:
+
+ - `group_reduce_and_last_wg_detection`
+ - `local_atomic_and_atomic_cross_wg`
+ - `range_basic`
+ - `group_reduce_and_atomic_cross_wg`
+ - `local_mem_tree_and_atomic_cross_wg`
+
+ Other strategies require implicit inter-work-group synchronization, which is
+ not supported in kernel fusion.
+
+ Consequently, users should not use `sycl::reduction` directly in code that
+ performs kernel fusion, as an unsupported strategy might be chosen. They should
+ instead use `sycl::detail::reduction_parallel_for`, which forces a supported
+ strategy. The reductions implementation in
+ [`sycl/reduction.hpp`](../../include/sycl/reduction.hpp) might give users an
+ insight into which kind of reduction to use for their purposes:
+
+ ```c++
+ q.submit([&](sycl::handler &cgh) {
+   sycl::accessor in(dataBuf, cgh, sycl::read_only);
+   sycl::reduction sum(sumBuf, cgh, sycl::plus<>{});
+   // Force supported 'group_reduce_and_last_wg_detection' strategy
+   sycl::detail::reduction_parallel_for<sycl::detail::auto_name,
+       sycl::detail::strategy::group_reduce_and_last_wg_detection>(...);
+ });
+ ```
+
### Unsupported SYCL constructs

The following SYCL API constructs are currently not officially supported for
kernel fusion and should be considered untested/unsupported:

- - Reductions
- `sycl::stream`
- Specialization constants and `sycl::kernel_handler`
- Images (`sycl::unsampled_image` and `sycl::sampled_image`)
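
To illustrate why a reduction strategy that combines partial results atomically needs no synchronization between work-groups, here is a minimal host-side model in plain C++. It is a sketch only: it is not SYCL code and not the DPC++ implementation. It assumes that `group_reduce_and_atomic_cross_wg` means a reduction within each work-group followed by an atomic combine across work-groups, as the name suggests. The function name `reduceWithAtomicCrossWG` and its `wgSize` parameter are hypothetical, and the "work-groups" are modelled as loop iterations run one after another.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Host-side model only; this is not SYCL code. Each loop iteration plays the
// role of one work-group. The function name and parameters are hypothetical.
long reduceWithAtomicCrossWG(const std::vector<int> &data, std::size_t wgSize) {
  assert(wgSize > 0 && "work-group size must be positive");
  std::atomic<long> result{0};
  for (std::size_t begin = 0; begin < data.size(); begin += wgSize) {
    const std::size_t end = std::min(begin + wgSize, data.size());
    // Step 1: reduce within the "work-group" (stand-in for a group reduce).
    const long partial = std::accumulate(
        data.begin() + static_cast<std::ptrdiff_t>(begin),
        data.begin() + static_cast<std::ptrdiff_t>(end), 0L);
    // Step 2: combine across "work-groups" with one atomic update. No group
    // ever waits for another, so the order of the groups does not matter.
    result.fetch_add(partial, std::memory_order_relaxed);
  }
  return result.load();
}
```

Calling `reduceWithAtomicCrossWG(data, 64)` on a vector of integers returns their sum regardless of how the data is split into groups. By contrast, a strategy in which one group must observe the finished results of the others relies on inter-work-group synchronization, which the Reductions section above describes as unsupported in kernel fusion.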