Skip to content

Commit e92bf52

Browse files
Pennycookgmlueck
andauthored
[SYCL][Doc] Add reduction properties extension (#15213)
Defines new compile-time properties that provide developers with greater control over how a reduction may be implemented. Closes #1621 . --------- Signed-off-by: John Pennycook <[email protected]> Co-authored-by: Greg Lueck <[email protected]>
1 parent 7f9e251 commit e92bf52

File tree

1 file changed

+246
-0
lines changed

1 file changed

+246
-0
lines changed
Lines changed: 246 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,246 @@
1+
= sycl_ext_oneapi_reduction_properties
2+
3+
:source-highlighter: coderay
4+
:coderay-linenums-mode: table
5+
6+
// This section needs to be after the document title.
7+
:doctype: book
8+
:toc2:
9+
:toc: left
10+
:encoding: utf-8
11+
:lang: en
12+
:dpcpp: pass:[DPC++]
13+
:endnote: &#8212;{nbsp}end{nbsp}note
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
21+
== Notice
22+
23+
[%hardbreaks]
24+
Copyright (C) 2024 Intel Corporation. All rights reserved.
25+
26+
Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks
27+
of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by
28+
permission by Khronos.
29+
30+
31+
== Contact
32+
33+
To report problems with this extension, please open a new issue at:
34+
35+
https://github.com/intel/llvm/issues
36+
37+
38+
== Dependencies
39+
40+
This extension is written against the SYCL 2020 revision 9 specification. All
41+
references below to the "core SYCL specification" or to section numbers in the
42+
SYCL specification refer to that revision.
43+
44+
This extension also depends on the following other SYCL extensions:
45+
46+
* link:../experimental/sycl_ext_oneapi_properties.asciidoc[
47+
sycl_ext_oneapi_properties]
48+
49+
50+
== Status
51+
52+
This is a proposed extension specification, intended to gather community
53+
feedback. Interfaces defined in this specification may not be implemented yet
54+
or may be in a preliminary state. The specification itself may also change in
55+
incompatible ways before it is finalized. *Shipping software products should
56+
not rely on APIs defined in this specification.*
57+
58+
59+
== Overview
60+
61+
In order to maximize portability across different device types, the SYCL 2020
62+
`reduction` interface gives implementers a significant amount of freedom in
63+
selecting the correct reduction algorithm to use for different types.
64+
65+
In the majority of cases, a developer can trust an implementation to choose the
66+
best algorithm; however, there are situations in which a user may wish to
67+
constrain algorithm selection (e.g., to ensure run-to-run reproducibility).
68+
This extension introduces new compile-time properties for the `reduction`
69+
interface that enable developers to provide such constraints.
70+
71+
72+
== Specification
73+
74+
=== Feature test macro
75+
76+
This extension provides a feature-test macro as described in the core SYCL
77+
specification. An implementation supporting this extension must predefine the
78+
macro `SYCL_EXT_ONEAPI_REDUCTION_PROPERTIES` to one of the values defined in
79+
the table below. Applications can test for the existence of this macro to
80+
determine if the implementation supports this feature, or applications can test
81+
the macro's value to determine which of the extension's features the
82+
implementation supports.
83+
84+
[%header,cols="1,5"]
85+
|===
86+
|Value
87+
|Description
88+
89+
|1
90+
|Initial version of this extension.
91+
|===
92+
93+
=== `reduction` overload
94+
95+
New `reduction` overloads are introduced to allow developers to attach
96+
compile-time properties to a reduction object.
97+
98+
Each new overload has the same behavior as its corresponding definition in the
99+
SYCL 2020 specification unless the definition of a property passed in via the
100+
final `sycl::ext::oneapi::experimental::properties` parameter says otherwise.
101+
102+
[source,c++]
103+
----
104+
namespace sycl {
105+
106+
template <typename BufferT, typename BinaryOperation, typename PropertyList>
107+
__unspecified__ reduction(BufferT vars, handler& cgh, BinaryOperation combiner,
108+
PropertyList properties);
109+
110+
template <typename T, typename BinaryOperation, typename PropertyList>
111+
__unspecified__ reduction(T* var, BinaryOperation combiner,
112+
PropertyList properties);
113+
114+
template <typename T, typename Extent, typename BinaryOperation, typename PropertyList>
115+
__unspecified__ reduction(span<T, Extent> vars, BinaryOperation combiner,
116+
PropertyList properties);
117+
118+
template <typename BufferT, typename BinaryOperation, typename PropertyList>
119+
__unspecified__
120+
reduction(BufferT vars, handler& cgh, const BufferT::value_type& identity,
121+
BinaryOperation combiner, PropertyList properties);
122+
123+
template <typename T, typename BinaryOperation, typename PropertyList>
124+
__unspecified__ reduction(T* var, const T& identity, BinaryOperation combiner,
125+
PropertyList properties);
126+
127+
template <typename T, typename Extent, typename BinaryOperation, typename PropertyList>
128+
__unspecified__ reduction(span<T, Extent> vars, const T& identity,
129+
BinaryOperation combiner,
130+
PropertyList properties);
131+
132+
}
133+
----
134+
135+
=== Reduction properties
136+
137+
New `reduction` properties are introduced to allow developers to constrain
138+
reduction algorithm selection based on desired behavior(s). Compile-time
139+
properties corresponding to existing runtime properties are also introduced to
140+
ensure that all information can be passed via a single property list.
141+
142+
If a reduction kernel is submitted to a device that cannot satisfy the
143+
request for specific reduction behavior(s), the implementation must throw an
144+
`exception` with the `errc::feature_not_supported` error code.
145+
146+
[source,c++]
147+
----
148+
namespace sycl::ext::oneapi {
149+
150+
struct deterministic_key {
151+
using value_t = property_value<deterministic_key>;
152+
};
153+
inline constexpr deterministic_key::value_t deterministic;
154+
155+
struct initialize_to_identity_key {
156+
using value_t = property_value<initialize_to_identity_key>;
157+
};
158+
inline constexpr initialize_to_identity_key::value_t initialize_to_identity;
159+
160+
}
161+
----
162+
163+
|===
164+
|Property|Description
165+
166+
|`deterministic`
167+
a|When two reductions both have this property, they are guaranteed to produce
168+
the same result when all of the following conditions hold:
169+
170+
* Both reductions run on the same device.
171+
* Both reductions are invoked with the same launch configuration (i.e., `range`
172+
or `nd_range`).
173+
* The same values are contributed to each reduction.
174+
* The work-items in each reduction contribute those values in the same pattern
175+
and the same order. For example, if the first reduction contributes values
176+
_V1_, _V2_, and _V3_ (in that order) from a work-item with linear index _i_;
177+
then the second reduction must also contribute values _V1_, _V2_, and _V3_
178+
(in that order) from the work-item with linear index _i_.
179+
180+
[_Note:_ Work-items may contribute different values to a reduction because of
181+
other potential sources of non-determinism, such as calls to group algorithms,
182+
use of atomic operations, etc. _{endnote}_]
183+
184+
|`initialize_to_identity`
185+
|Adds the same requirement as
186+
`sycl::property::reduction::initialize_to_identity`.
187+
188+
|===
189+
190+
191+
=== Usage example
192+
193+
[source,c++]
194+
----
195+
using syclex = sycl::ext::oneapi::experimental;
196+
197+
float sum(sycl::queue q, float* input, size_t N) {
198+
199+
float result = 0;
200+
{
201+
sycl::buffer<float> buf{&result, 1};
202+
203+
q.submit([&](sycl::handler& h) {
204+
auto reduction = sycl::reduction(buf, h, sycl::plus<>(), syclex::properties(syclex::deterministic));
205+
h.parallel_for(N, reduction, [=](size_t i, auto& reducer) {
206+
reducer += input[i];
207+
});
208+
}
209+
}
210+
return result;
211+
212+
}
213+
214+
...
215+
216+
float x = sum(q, array, 1024);
217+
float y = sum(q, array, 1024);
218+
219+
// NB: determinism guarantees bitwise reproducible reductions for floats
220+
assert(sycl::bit_cast<unsigned int>(x) == sycl::bit_cast<unsigned int>(y));
221+
----
222+
223+
224+
== Implementation notes
225+
226+
This non-normative section provides information about one possible
227+
implementation of this extension. It is not part of the specification of the
228+
extension's API.
229+
230+
Since SYCL implementations must support arbitrary types, we anticipate that
231+
many implementations will already have appropriate reduction variants available
232+
that satisfy the constraints imposed by these new properties. Implementing
233+
support for these new constraints may therefore be as straightforward as
234+
providing a new overload of `sycl::reduction` that overrides the algorithm
235+
selection process.
236+
237+
The steps necessary to guarantee deterministic results are type-dependent. For
238+
integers and built-in combination operators, all implementations should be
239+
deterministic by default. For floating-point numbers and/or custom combination
240+
operators, it becomes necessary to ensure that the intermediate results from
241+
each work-item are always combined in the same order.
242+
243+
244+
== Issues
245+
246+
None.

0 commit comments

Comments
 (0)