sycl/doc/extensions/sub_group_ndrange/sub_group_ndrange.md
Lines changed: 35 additions & 35 deletions
@@ -131,8 +131,8 @@ The vote / ballot sub-group functions communicate Boolean conditions between the
131
131
132
132
|Member functions|Description|
133
133
|----------------|-----------|
134
-
|`bool any(bool predicate)`| Return `true` if `predicate` evaluates to `true` for any work-item in the sub-group. |
135
-
|`bool all(bool predicate)`| Return `true` if `predicate` evaluates to `true` for all work-items in the sub-group. |
134
+
|`bool any(bool predicate) const`| Return `true` if `predicate` evaluates to `true` for any work-item in the sub-group. |
135
+
|`bool all(bool predicate) const`| Return `true` if `predicate` evaluates to `true` for all work-items in the sub-group. |
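To make the vote semantics concrete, here is a minimal host-side sketch in plain C++. It is not SYCL code and not the extension's implementation: the names `model_any` and `model_all` are hypothetical, and a `std::vector<bool>` stands in for the predicate values supplied by the work-items of one sub-group.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Host-side model only (hypothetical helpers, not part of the extension).
// predicates[i] is the value that work-item i passes to any()/all().
// Every work-item in the sub-group would observe the same returned value.
bool model_any(const std::vector<bool>& predicates) {
    assert(!predicates.empty());  // a sub-group has at least one work-item
    return std::any_of(predicates.begin(), predicates.end(),
                       [](bool p) { return p; });
}

bool model_all(const std::vector<bool>& predicates) {
    assert(!predicates.empty());
    return std::all_of(predicates.begin(), predicates.end(),
                       [](bool p) { return p; });
}
```

With predicates `{false, true, false}`, `model_any` yields `true` and `model_all` yields `false`.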
136
136
137
137
### Collectives
138
138
@@ -142,10 +142,10 @@ The `plus`, `minimum` and `maximum` functors in the `cl::sycl` namespace corresp
142
142
143
143
|Member functions|Description|
144
144
|----------------|-----------|
145
-
|`template <typename T>T broadcast(T x, id<1> local_id)`| Broadcast the value of `x` from the work-item with the specified id to all work-items within the sub-group. The value of `local_id` must be the same for all work-items in the sub-group. |
146
-
|`template <typename T, class BinaryOp>T reduce(T x, T init, BinaryOp binary_op)`| Combine the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. |
147
-
|`template <typename T, class BinaryOp>T exclusive_scan(T x, T init, BinaryOp binary_op)`| Perform an exclusive scan over the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. The value returned on work-item `i` is the exclusive scan of the first `i` work-items in the sub-group. |
148
-
|`template <typename T, class BinaryOp>T inclusive_scan(T x, BinaryOp binary_op, T init)`| Perform an inclusive scan over the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. The value returned on work-item `i` is the inclusive scan of the first `i` work-items in the sub-group. |
145
+
|`template <typename T>T broadcast(T x, id<1> local_id) const`| Broadcast the value of `x` from the work-item with the specified id to all work-items within the sub-group. The value of `local_id` must be the same for all work-items in the sub-group. |
146
+
|`template <typename T, class BinaryOp>T reduce(T x, T init, BinaryOp binary_op) const`| Combine the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. |
147
+
|`template <typename T, class BinaryOp>T exclusive_scan(T x, T init, BinaryOp binary_op) const`| Perform an exclusive scan over the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. The value returned on work-item `i` is the exclusive scan of the first `i` work-items in the sub-group. |
148
+
|`template <typename T, class BinaryOp>T inclusive_scan(T x, BinaryOp binary_op, T init) const`| Perform an inclusive scan over the values of `x` from all work-items in the sub-group using the specified operator, which must be one of: `plus`, `minimum` or `maximum`. The value returned on work-item `i` is the inclusive scan of the first `i` work-items in the sub-group. |
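The difference between the reduction and the two scans can be sketched with a host-side model in plain C++. This is an illustration under stated assumptions, not the extension's implementation: the `model_*` names are hypothetical, `x[i]` stands for the value contributed by work-item `i`, `std::plus` stands in for the `plus` functor, and `init` is assumed to be combined before the first element (the table does not spell this out).

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Host-side model only (hypothetical helpers). out[i] is the value that
// work-item i would receive. Assumption: init is folded in first.
template <typename T, class BinaryOp>
T model_reduce(const std::vector<T>& x, T init, BinaryOp op) {
    assert(!x.empty());  // a sub-group has at least one work-item
    T running = init;
    for (const T& v : x) running = op(running, v);
    return running;
}

template <typename T, class BinaryOp>
std::vector<T> model_exclusive_scan(const std::vector<T>& x, T init, BinaryOp op) {
    std::vector<T> out;
    out.reserve(x.size());
    T running = init;
    for (const T& v : x) {
        out.push_back(running);    // excludes work-item i's own value
        running = op(running, v);
    }
    return out;
}

template <typename T, class BinaryOp>
std::vector<T> model_inclusive_scan(const std::vector<T>& x, BinaryOp op, T init) {
    std::vector<T> out;
    out.reserve(x.size());
    T running = init;
    for (const T& v : x) {
        running = op(running, v);  // includes work-item i's own value
        out.push_back(running);
    }
    return out;
}
```

For inputs `{1, 2, 3, 4}` with `init = 0` and `std::plus<int>`, this model gives a reduction of `10`, an exclusive scan of `{0, 1, 3, 6}`, and an inclusive scan of `{1, 3, 6, 10}`. The argument order mirrors the signatures in the table, where `inclusive_scan` takes `binary_op` before `init`.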
149
149
150
150
## Extended Functionality
151
151
@@ -155,31 +155,31 @@ The shuffle sub-group functions perform arbitrary communication between pairs of
155
155
156
156
|Member functions|Description|
157
157
|----------------|-----------|
158
-
|`template <typename T>T shuffle(T x, id<1> local_id)`| Exchange values of `x` between work-items in the sub-group in an arbitrary pattern. Returns the value of `x` from the work-item with the specified id. The value of `local_id` must be between 0 and the sub-group size. |
159
-
|`template <typename T>T shuffle_down(T x, uint32_t delta)`| Exchange values of `x` between work-items in the sub-group via a shift. Returns the value of `x` from the work-item whose id is `delta` larger than the calling work-item. The value returned when the result of id + `delta` is greater than or equal to the sub-group size is undefined. |
160
-
|`template <typename T>T shuffle_up(T x, uint32_t delta)`| Exchange values of `x` between work-items in the sub-group via a shift. Returns the value of `x` from the work-item whose id is `delta` smaller than the calling work-item. The value returned when the result of id - `delta` is less than zero is undefined. |
161
-
|`template <typename T>T shuffle_xor(T x, id<1> mask)`| Exchange pairs of values of `x` between work-items in the sub-group. Returns the value of `x` from the work-item whose id is equal to the exclusive-or of the calling work-item's id and `mask`. `mask` must be a compile-time constant value that is the same for all work-items in the sub-group. |
158
+
|`template <typename T>T shuffle(T x, id<1> local_id) const`| Exchange values of `x` between work-items in the sub-group in an arbitrary pattern. Returns the value of `x` from the work-item with the specified id. The value of `local_id` must be between 0 and the sub-group size. |
159
+
|`template <typename T>T shuffle_down(T x, uint32_t delta) const`| Exchange values of `x` between work-items in the sub-group via a shift. Returns the value of `x` from the work-item whose id is `delta` larger than the calling work-item. The value returned when the result of id + `delta` is greater than or equal to the sub-group size is undefined. |
160
+
|`template <typename T>T shuffle_up(T x, uint32_t delta) const`| Exchange values of `x` between work-items in the sub-group via a shift. Returns the value of `x` from the work-item whose id is `delta` smaller than the calling work-item. The value returned when the result of id - `delta` is less than zero is undefined. |
161
+
|`template <typename T>T shuffle_xor(T x, id<1> mask) const`| Exchange pairs of values of `x` between work-items in the sub-group. Returns the value of `x` from the work-item whose id is equal to the exclusive-or of the calling work-item's id and `mask`. `mask` must be a compile-time constant value that is the same for all work-items in the sub-group. |
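The one-input shuffles can be illustrated with a small host-side model in plain C++. This is a sketch, not the extension's implementation: the `model_*` names are hypothetical, `x[i]` is the value held by work-item `i`, and the extra `undef` parameter is an assumption introduced here to mark the lanes whose result the table calls undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model only (hypothetical helpers). out[i] is what work-item i
// would receive; lanes with an undefined result are filled with `undef`.
template <typename T>
std::vector<T> model_shuffle_down(const std::vector<T>& x, std::size_t delta, T undef) {
    std::vector<T> out(x.size(), undef);
    for (std::size_t i = 0; i + delta < x.size(); ++i) out[i] = x[i + delta];
    return out;
}

template <typename T>
std::vector<T> model_shuffle_up(const std::vector<T>& x, std::size_t delta, T undef) {
    std::vector<T> out(x.size(), undef);
    for (std::size_t i = delta; i < x.size(); ++i) out[i] = x[i - delta];
    return out;
}

template <typename T>
std::vector<T> model_shuffle_xor(const std::vector<T>& x, std::size_t mask) {
    std::vector<T> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::size_t src = i ^ mask;
        assert(src < x.size());  // assumption: the partner lane must exist
        out[i] = x[src];
    }
    return out;
}
```

For `x = {10, 11, 12, 13}` and `undef = -1`, shifting down by 1 gives `{11, 12, 13, -1}`, shifting up by 1 gives `{-1, 10, 11, 12}`, and an exclusive-or with mask 1 swaps neighbouring pairs to give `{11, 10, 13, 12}`.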
162
162
163
163
### Two-Input Shuffles
164
164
165
165
This proposal makes a distinction between shuffles with one input per work-item and shuffles with two inputs per work-item. The two-input versions map naturally to SIMD execution (see the `shuffle2` vector operation from OpenCL), and enable developers to avoid certain undefined behaviors from the one-input versions. The simplest way to think of the two-input shuffles is that their operation is equivalent to a one-input shuffle on a virtual sub-group twice as big.
166
166
167
167
|Member functions|Description|
168
168
|----------------|-----------|
169
-
|`template <typename T>T shuffle(T x, T y, id<1> local_id)`| Exchange values of `x` and `y` between work-items in the sub-group in an arbitrary pattern. If `local_id` is between 0 and the sub-group size, returns the value of `x` from the work-item with the specified id; if `local_id` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `local_id` must be between 0 and twice the sub-group size. |
170
-
|`template <typename T>T shuffle_down(T x, T y, uint32_t delta)`| Exchange values of `x` and `y` between work-items in the sub-group via a shift. If the calling work-item's id + `delta` is between 0 and the sub-group size, returns the value of `x` from the work-item whose id is `delta` larger than the calling work-item; if the calling work-item's id + `delta` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `delta` must be less than the sub-group size. |
171
-
|`template <typename T>T shuffle_up(T x, T y, uint32_t delta)`| Exchange values of `x` and `y` between work-items in the sub-group via a shift. If the calling work-item's id - `delta` is between 0 and the sub-group size, returns the value of `x` from the work-item whose id is `delta` smaller than the calling work-item; if the calling work-item's id - `delta` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `delta` must be less than the sub-group size. |
169
+
|`template <typename T>T shuffle(T x, T y, id<1> local_id) const`| Exchange values of `x` and `y` between work-items in the sub-group in an arbitrary pattern. If `local_id` is between 0 and the sub-group size, returns the value of `x` from the work-item with the specified id; if `local_id` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `local_id` must be between 0 and twice the sub-group size. |
170
+
|`template <typename T>T shuffle_down(T x, T y, uint32_t delta) const`| Exchange values of `x` and `y` between work-items in the sub-group via a shift. If the calling work-item's id + `delta` is between 0 and the sub-group size, returns the value of `x` from the work-item whose id is `delta` larger than the calling work-item; if the calling work-item's id + `delta` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `delta` must be less than the sub-group size. |
171
+
|`template <typename T>T shuffle_up(T x, T y, uint32_t delta) const`| Exchange values of `x` and `y` between work-items in the sub-group via a shift. If the calling work-item's id - `delta` is between 0 and the sub-group size, returns the value of `x` from the work-item whose id is `delta` smaller than the calling work-item; if the calling work-item's id - `delta` is between the sub-group size and twice the sub-group size, returns the value of `y` from the work-item with the specified id (modulo the sub-group size). The value of `delta` must be less than the sub-group size. |
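The "virtual sub-group twice as big" reading of the two-input shuffles can be sketched as a host-side model in plain C++. The `model_*` names are hypothetical and this is only an illustration: lanes `0..n-1` of the virtual sub-group hold the `x` values and lanes `n..2n-1` hold the `y` values. Only `shuffle` and `shuffle_down` are modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model only (hypothetical helpers). x[i] and y[i] are the two
// inputs of work-item i; out[i] is the value that work-item i would receive.
template <typename T>
std::vector<T> model_shuffle2(const std::vector<T>& x, const std::vector<T>& y,
                              const std::vector<std::size_t>& local_id) {
    const std::size_t n = x.size();
    assert(y.size() == n && local_id.size() == n);
    std::vector<T> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t id = local_id[i];
        assert(id < 2 * n);  // must stay inside the virtual sub-group
        out[i] = (id < n) ? x[id] : y[id - n];
    }
    return out;
}

template <typename T>
std::vector<T> model_shuffle_down2(const std::vector<T>& x, const std::vector<T>& y,
                                   std::size_t delta) {
    const std::size_t n = x.size();
    assert(y.size() == n);
    assert(delta < n);  // precondition stated in the table
    std::vector<T> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t id = i + delta;
        out[i] = (id < n) ? x[id] : y[id - n];
    }
    return out;
}
```

With `x = {10, 11, 12, 13}` and `y = {20, 21, 22, 23}`, shifting down by 1 gives `{11, 12, 13, 20}`: the last lane reads from `y` where the one-input version would return an undefined value.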
172
172
173
173
### Loads / Stores
174
174
175
175
The load and store sub-group functions enable developers to assert that all work-items in a sub-group read/write from/to contiguous locations in memory. Such operations can be mapped directly to SIMD operations.
176
176
177
177
|Member functions|Description|
178
178
|----------------|-----------|
179
-
|`template <typename T, access::address_space Space>T load(const multi_ptr<T,Space> src)`| Load contiguous data from `src`. Returns one element per work-item, corresponding to the memory location at `src` + `get_local_id()`. The value of `src` must be the same for all work-items in the sub-group. |
180
-
|`template <int N, typename T, access::address_space Space>vec<T,N> load(const multi_ptr<T,Space> src)`| Load contiguous data from `src`. Returns `N` elements per work-item, corresponding to the `N` memory locations at `src` + `i` * `get_max_local_range()` + `get_local_id()` for `i` between 0 and `N`. The value of `src` must be the same for all work-items in the sub-group. |
181
-
|`template <typename T, access::address_space Space>void store(multi_ptr<T,Space> dst, const T& x)`| Store contiguous data to `dst`. The value of `x` from each work-item is written to the memory location at `dst` + `get_local_id()`. The value of `dst` must be the same for all work-items in the sub-group. |
182
-
|`template <int N, typename T, access::address_space Space>void store(multi_ptr<T,Space> dst, const vec<T,N>& x)`| Store contiguous data to `dst`. The `N` elements from each work-item are written to the memory locations at `dst` + `i` * `get_max_local_range()` + `get_local_id()` for `i` between 0 and `N`. The value of `dst` must be the same for all work-items in the sub-group. |
179
+
|`template <typename T, access::address_space Space>T load(const multi_ptr<T,Space> src) const`| Load contiguous data from `src`. Returns one element per work-item, corresponding to the memory location at `src` + `get_local_id()`. The value of `src` must be the same for all work-items in the sub-group. |
180
+
|`template <int N, typename T, access::address_space Space>vec<T,N> load(const multi_ptr<T,Space> src) const`| Load contiguous data from `src`. Returns `N` elements per work-item, corresponding to the `N` memory locations at `src` + `i` * `get_max_local_range()` + `get_local_id()` for `i` between 0 and `N`. The value of `src` must be the same for all work-items in the sub-group. |
181
+
|`template <typename T, access::address_space Space>void store(multi_ptr<T,Space> dst, const T& x) const`| Store contiguous data to `dst`. The value of `x` from each work-item is written to the memory location at `dst` + `get_local_id()`. The value of `dst` must be the same for all work-items in the sub-group. |
182
+
|`template <int N, typename T, access::address_space Space>void store(multi_ptr<T,Space> dst, const vec<T,N>& x) const`| Store contiguous data to `dst`. The `N` elements from each work-item are written to the memory locations at `dst` + `i` * `get_max_local_range()` + `get_local_id()` for `i` between 0 and `N`. The value of `dst` must be the same for all work-items in the sub-group. |
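The addressing used by the `N`-element load and store can be checked with a host-side model in plain C++. This is a sketch under assumptions, not the extension's implementation: the `model_*` names are hypothetical, a `std::vector` stands in for memory, `src` and `dst` are offsets into it, and "`i` between 0 and `N`" is read as `0 <= i < N`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model only (hypothetical helpers). Element i of one work-item
// lives at offset: base + i * max_local_range + local_id.
template <typename T>
std::vector<T> model_load(const std::vector<T>& memory, std::size_t src,
                          std::size_t max_local_range, std::size_t local_id,
                          std::size_t n) {
    std::vector<T> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t idx = src + i * max_local_range + local_id;
        assert(idx < memory.size());
        out.push_back(memory[idx]);
    }
    return out;
}

template <typename T>
void model_store(std::vector<T>& memory, std::size_t dst,
                 std::size_t max_local_range, std::size_t local_id,
                 const std::vector<T>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        const std::size_t idx = dst + i * max_local_range + local_id;
        assert(idx < memory.size());
        memory[idx] = x[i];
    }
}
```

With memory holding `0..7`, a maximum local range of 4, and `N = 2`, the work-item with local id 1 loads the elements at offsets 1 and 5. Consecutive work-items touch consecutive addresses for each `i`, which is what makes the access contiguous across the sub-group.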
183
183
184
184
# Sample Header
185
185
@@ -195,7 +195,7 @@ struct sub_group {
195
195
196
196
range<1> get_local_range() const;
197
197
198
-
range<1> get_max_local_range();
198
+
range<1> get_max_local_range() const;
199
199
200
200
id<1> get_group_id() const;
201
201
@@ -205,62 +205,62 @@ struct sub_group {
205
205
206
206
/* --- vote/ballot functions --- */
207
207
208
-
bool any(bool predicate);
208
+
bool any(bool predicate) const;
209
209
210
-
bool all(bool predicate);
210
+
bool all(bool predicate) const;
211
211
212
212
/* --- data-sharing --- */
213
213
214
214
template <typename T>
215
-
T broadcast(T x, id<1> local_id);
215
+
T broadcast(T x, id<1> local_id) const;
216
216
217
217
template <typename T, class BinaryOp>
218
-
T reduce(T x, T init, BinaryOp binary_op);
218
+
T reduce(T x, T init, BinaryOp binary_op) const;
219
219
220
220
template <typename T, class BinaryOp>
221
-
T exclusive_scan(T x, T init, BinaryOp binary_op);
221
+
T exclusive_scan(T x, T init, BinaryOp binary_op) const;
222
222
223
223
template <typename T, class BinaryOp>
224
-
T inclusive_scan(T x, BinaryOp binary_op, T init);
224
+
T inclusive_scan(T x, BinaryOp binary_op, T init) const;
225
225
226
226
/* --- one-input shuffles --- */
227
227
228
228
template <typename T>
229
-
T shuffle(T x, id<1> local_id);
229
+
T shuffle(T x, id<1> local_id) const;
230
230
231
231
template <typename T>
232
-
T shuffle_down(T x, uint32_t delta);
232
+
T shuffle_down(T x, uint32_t delta) const;
233
233
234
234
template <typename T>
235
-
T shuffle_up(T x, uint32_t delta);
235
+
T shuffle_up(T x, uint32_t delta) const;
236
236
237
237
template <typename T>
238
-
T shuffle_xor(T x, id<1> value);
238
+
T shuffle_xor(T x, id<1> value) const;
239
239
240
240
/* --- two-input shuffles --- */
241
241
242
242
template <typename T>
243
-
T shuffle(T x, T y, id<1> local_id);
243
+
T shuffle(T x, T y, id<1> local_id) const;
244
244
245
245
template <typename T>
246
-
T shuffle_down(T current, T next, uint32_t delta);
246
+
T shuffle_down(T current, T next, uint32_t delta) const;
247
247
248
248
template <typename T>
249
-
T shuffle_up(T previous, T current, uint32_t delta);
249
+
T shuffle_up(T previous, T current, uint32_t delta) const;