|
| 1 | += SYCL_EXT_ONEAPI_GROUP_MASK |
| 2 | +:source-highlighter: coderay |
| 3 | +:coderay-linenums-mode: table |
| 4 | + |
| 5 | +// This section needs to be after the document title. |
| 6 | +:doctype: book |
| 7 | +:toc2: |
| 8 | +:toc: left |
| 9 | +:encoding: utf-8 |
| 10 | +:lang: en |
| 11 | + |
| 12 | +:blank: pass:[ +] |
| 13 | + |
| 14 | +// Set the default source code type in this document to C++, |
| 15 | +// for syntax highlighting purposes. This is needed because |
| 16 | +// docbook uses c++ and html5 uses cpp. |
| 17 | +:language: {basebackend@docbook:c++:cpp} |
| 18 | + |
| 19 | +== Introduction |
| 20 | +IMPORTANT: This specification is a draft. |
| 21 | + |
| 22 | +NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos. |
| 23 | + |
| 24 | +This document describes an extension which adds a `group_mask` type. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds. Group mask functionality is currently limited to groups that are instances of the `sub_group` class. |
| 25 | + |
| 26 | +== Notice |
| 27 | + |
| 28 | +Copyright (c) 2021 Intel Corporation. All rights reserved. |
| 29 | + |
| 30 | +== Status |
| 31 | + |
| 32 | +Working Draft |
| 33 | + |
| 34 | +This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension. |
| 35 | + |
| 36 | +Because the interfaces defined by this specification are not final and are subject to change they are not intended to be used by shipping software products. |
| 37 | + |
| 38 | +== Version |
| 39 | + |
| 40 | +Revision: 1 |
| 41 | + |
| 42 | +== Contact |
| 43 | +John Pennycook, Intel (john 'dot' pennycook 'at' intel 'dot' com) |
| 44 | + |
| 45 | +== Dependencies |
| 46 | + |
| 47 | +This extension is written against the SYCL 2020 specification, Revision 3. |
| 48 | + |
| 49 | +== Feature Test Macro |
| 50 | + |
| 51 | +This extension provides a feature-test macro as described in the core SYCL |
| 52 | +specification section 6.3.3 "Feature test macros". Therefore, an |
| 53 | +implementation supporting this extension must predefine the macro |
| 54 | +`SYCL_EXT_ONEAPI_GROUP_MASK` to one of the values defined in the table below. |
| 55 | +Applications can test for the existence of this macro to determine if the |
| 56 | +implementation supports this feature, or applications can test the macro's |
| 57 | +value to determine which of the extension's APIs the implementation supports. |
| 58 | + |
| 59 | +[%header,cols="1,5"] |
| 60 | +|=== |
| 61 | +|Value |Description |
| 62 | +|1 |Initial extension version. Base features are supported. |
| 63 | +|=== |
| 64 | + |
| 65 | +== Overview |
| 66 | + |
| 67 | +A group mask is an integral type sized such that each work-item in the group is |
| 68 | +represented by a single bit. Such a mask can be used to efficiently represent |
| 69 | +subsets of work-items in a group for which a given Boolean condition holds. |
| 70 | + |
| 71 | +Group mask functionality is currently limited to groups that are instances of |
| 72 | +the `sub_group` class, but this limitation may be lifted in a future version of |
| 73 | +the specification. |
| 74 | + |
| 75 | +=== Ballot |
| 76 | + |
| 77 | +The `group_ballot` algorithm converts a Boolean condition from each work-item |
| 78 | +in the group into a group mask. Like other group algorithms, `group_ballot` |
| 79 | +must be encountered by all work-items in the group in converged control flow. |
| 80 | + |
| 81 | +|=== |
| 82 | +|Function|Description |
| 83 | + |
| 84 | +|`template <typename Group> Group::mask_type group_ballot(Group g, bool predicate = true) const` |
| 85 | +|Return a `group_mask` representing the set of work-items in group _g_ for which _predicate_ is `true`. |
| 86 | +|=== |
| 87 | + |
| 88 | +=== Group Masks |
| 89 | + |
| 90 | +The group mask type is an opaque type, permitting implementations to use any |
| 91 | +mask representation that has the same size and alignment across host and |
| 92 | +device. The maximum number of bits that can be stored in a `group_mask` is |
| 93 | +exposed as a static member variable, `group_mask::max_bits`. |
| 94 | + |
| 95 | +Functions declared in the `group_mask` class can be called independently by |
| 96 | +different work-items in the same group. An instance of a group class (e.g. |
| 97 | +`group` or `sub_group`) is not required to manipulate a group mask. |
| 98 | + |
| 99 | +The mask is defined such that the least significant bit (LSB) corresponds to |
| 100 | +the work-item with id 0, and the most significant bit (MSB) corresponds to the |
| 101 | +work-item with the id `max_local_range()-1`. |
| 102 | + |
| 103 | +|=== |
| 104 | +|Member Function|Description |
| 105 | + |
| 106 | +|`bool operator[](id<1> id) const` |
| 107 | +|Return `true` if the bit corresponding to the specified _id_ is set in the |
| 108 | + mask. |
| 109 | + |
| 110 | +|`group_mask::reference operator[](id<1> id)` |
| 111 | +|Return a reference to the bit corresponding to the specified _id_ in the mask. |
| 112 | + |
| 113 | +|`bool test(id<1> id) const` |
| 114 | +|Return `true` if the bit corresponding to the specified _id_ is set in the |
| 115 | + mask. |
| 116 | + |
| 117 | +|`bool all() const` |
| 118 | +|Return `true` if all bits in the mask are set. |
| 119 | + |
| 120 | +|`bool any() const` |
| 121 | +|Return `true` if any bits in the mask are set. |
| 122 | + |
| 123 | +|`bool none() const` |
| 124 | +|Return `true` if none of the bits in the mask are set. |
| 125 | + |
| 126 | +|`uint32_t count() const` |
| 127 | +|Return the number of bits set in the mask. |
| 128 | + |
| 129 | +|`uint32_t size() const` |
| 130 | +|Return the number of bits in the mask. |
| 131 | + |
| 132 | +|`id<1> find_low() const` |
| 133 | +|Return the lowest `id` with a corresponding bit set in the mask. If no bits |
| 134 | + are set, the return value is equal to `size()`. |
| 135 | + |
| 136 | +|`id<1> find_high() const` |
| 137 | +|Return the highest `id` with a corresponding bit set in the mask. If no bits |
| 138 | + are set, the return value is equal to `size()`. |
| 139 | + |
| 140 | +|`template <typename T = marray<uint32_t, max_bits/sizeof(uint32_t)>> void insert_bits(T bits, id<1> pos = 0)` |
| 141 | +|Insert `CHAR_BIT * sizeof(T)` bits into the mask, starting from _pos_. `T` |
| 142 | + must be an integral type or a SYCL `marray` of integral types. _pos_ must be a |
| 143 | + multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ pass:[+] |
| 144 | + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final `size()` - (_pos_ pass:[+] |
| 145 | + `CHAR_BIT * sizeof(T)`) bits are ignored. |
| 146 | + |
| 147 | +|`template <typename T = marray<uint32_t, max_bits/sizeof(uint32_t)>> T extract_bits(id<1> pos = 0) const` |
| 148 | +|Return `CHAR_BIT * sizeof(T)` bits from the mask, starting from _pos_. `T` |
| 149 | + must be an integral type or a SYCL `marray` of integral types. _pos_ must be a |
| 150 | + multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ pass:[+] |
| 151 | + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final `size()` - (_pos_ pass:[+] |
| 152 | + `CHAR_BIT * sizeof(T)`) bits of the return value are zero. |
| 153 | + |
| 154 | +|`void set()` |
| 155 | +|Set all bits in the mask to true. |
| 156 | + |
| 157 | +|`void set(id<1> id, bool value = true)` |
| 158 | +|Set the bit corresponding to the specified _id_ to the value specified by |
| 159 | + _value_. |
| 160 | + |
| 161 | +|`void reset()` |
| 162 | +|Reset all bits in the mask. |
| 163 | + |
| 164 | +|`void reset(id<1> id)` |
| 165 | +|Reset the bit corresponding to the specified _id_. |
| 166 | + |
| 167 | +|`void reset_low()` |
| 168 | +|Reset the bit for the lowest `id` with a corresponding bit set in the mask. |
| 169 | + Functionally equivalent to `reset(find_low())`. |
| 170 | + |
| 171 | +|`void reset_high()` |
| 172 | +|Reset the bit for the highest `id` with a corresponding bit set in the mask. |
| 173 | + Functionally equivalent to `reset(find_high())`. |
| 174 | + |
| 175 | +|`void flip()` |
| 176 | +|Toggle the values of all bits in the mask. |
| 177 | + |
| 178 | +|`void flip(id<1> id)` |
| 179 | +|Toggle the value of the bit corresponding to the specified _id_. |
| 180 | + |
| 181 | +|`bool operator==(group_mask rhs) const` |
| 182 | +|Return true if each bit in this mask is equal to the corresponding bit in |
| 183 | + `rhs`. |
| 184 | + |
| 185 | +|`bool operator!=(group_mask rhs) const` |
| 186 | +|Return true if any bit in this mask is not equal to the corresponding bit in |
| 187 | + `rhs`. |
| 188 | + |
| 189 | +|`group_mask operator &=(group_mask rhs)` |
| 190 | +|Set the bits of this mask to the result of performing a bitwise AND with this |
| 191 | + mask and `rhs`. |
| 192 | + |
| 193 | +|`group_mask operator \|=(group_mask rhs)` |
| 194 | +|Set the bits of this mask to the result of performing a bitwise OR with this |
| 195 | + mask and `rhs`. |
| 196 | + |
| 197 | +|`group_mask operator ^=(group_mask rhs)` |
| 198 | +|Set the bits of this mask to the result of performing a bitwise XOR with this |
| 199 | + mask and `rhs`. |
| 200 | + |
| 201 | +|`group_mask operator pass:[<<=](size_t shift)` |
| 202 | +|Set the bits of this mask to the result of shifting its bits _shift_ positions |
| 203 | + to the left using a logical shift. Bits that are shifted out to the left are |
| 204 | + discarded, and zeroes are shifted in from the right. |
| 205 | + |
| 206 | +|`group_mask operator >>=(size_t shift)` |
| 207 | +|Set the bits of this mask to the result of shifting its bits _shift_ positions |
| 208 | + to the right using a logical shift. Bits that are shifted out to the right are |
| 209 | + discarded, and zeroes are shifted in from the left. |
| 210 | + |
| 211 | +|`group_mask operator ~() const` |
| 212 | +|Return a mask representing the result of flipping all the bits in this mask. |
| 213 | + |
| 214 | +|`group_mask operator <<(size_t shift)` |
| 215 | +|Return a mask representing the result of shifting its bits _shift_ positions |
| 216 | + to the left using a logical shift. Bits that are shifted out to the left are |
| 217 | + discarded, and zeroes are shifted in from the right. |
| 218 | + |
| 219 | +|`group_mask operator >>(size_t shift)` |
| 220 | +|Return a mask representing the result of shifting its bits _shift_ positions |
| 221 | + to the right using a logical shift. Bits that are shifted out to the right are |
| 222 | + discarded, and zeroes are shifted in from the left. |
| 223 | +|=== |
| 224 | + |
| 225 | +|=== |
| 226 | +|Function|Description |
| 227 | + |
| 228 | +|`group_mask operator &(const group_mask& lhs, const group_mask& rhs)` |
| 229 | +|Return a mask representing the result of performing a bitwise AND of `lhs` and |
| 230 | + `rhs`. |
| 231 | + |
| 232 | +|`group_mask operator \|(const group_mask& lhs, const group_mask& rhs)` |
| 233 | +|Return a mask representing the result of performing a bitwise OR of `lhs` and |
| 234 | + `rhs`. |
| 235 | + |
| 236 | +|`group_mask operator ^(const group_mask& lhs, const group_mask& rhs)` |
| 237 | +|Return a mask representing the result of performing a bitwise XOR of `lhs` and |
| 238 | + `rhs`. |
| 239 | + |
| 240 | +|=== |
| 241 | + |
| 242 | +==== Sample Header |
| 243 | + |
| 244 | +[source, c++] |
| 245 | +---- |
| 246 | +namespace sycl { |
| 247 | +namespace ext { |
| 248 | +namespace oneapi { |
| 249 | +
|
| 250 | +struct group_mask { |
| 251 | +
|
| 252 | + // enable reference to individual bit |
| 253 | + struct reference { |
| 254 | + reference& operator=(bool x); |
| 255 | + reference& operator=(const reference& x); |
| 256 | + bool operator~() const; |
| 257 | + operator bool() const; |
| 258 | + reference& flip(); |
| 259 | + }; |
| 260 | +
|
| 261 | + static constexpr size_t max_bits = /* implementation-defined */; |
| 262 | +
|
| 263 | + bool operator[](id<1> id) const; |
| 264 | + reference operator[](id<1> id); |
| 265 | + bool test(id<1> id) const; |
| 266 | + bool all() const; |
| 267 | + bool any() const; |
| 268 | + bool none() const; |
| 269 | + uint32_t count() const; |
| 270 | + uint32_t size() const; |
| 271 | + id<1> find_low() const; |
| 272 | + id<1> find_high() const; |
| 273 | +
|
| 274 | + template <typename T = marray<uint32_t, max_bits/sizeof(uint32_t)>> |
| 275 | + void insert_bits(T bits, id<1> pos = 0); |
| 276 | +
|
| 277 | + template <typename T = marray<uint32_t, max_bits/sizeof(uint32_t)>> |
| 278 | + T extract_bits(id<1> pos = 0); |
| 279 | +
|
| 280 | + void set(); |
| 281 | + void set(id<1> id, bool value = true); |
| 282 | + void reset(); |
| 283 | + void reset(id<1> id); |
| 284 | + void reset_low(); |
| 285 | + void reset_high(); |
| 286 | + void flip(); |
| 287 | + void flip(id<1> id); |
| 288 | +
|
| 289 | + bool operator==(group_mask rhs) const; |
| 290 | + bool operator!=(group_mask rhs) const; |
| 291 | +
|
| 292 | + group_mask operator &=(group_mask rhs); |
| 293 | + group_mask operator |=(group_mask rhs); |
| 294 | + group_mask operator ^=(group_mask rhs); |
| 295 | + group_mask operator <<=(size_t); |
| 296 | + group_mask operator >>=(size_t rhs); |
| 297 | +
|
| 298 | + group_mask operator ~() const; |
| 299 | + group_mask operator <<(size_t) const; |
| 300 | + group_mask operator >>(size_t) const; |
| 301 | +
|
| 302 | +}; |
| 303 | +
|
| 304 | +group_mask operator &(const group_mask& lhs, const group_mask& rhs); |
| 305 | +group_mask operator |(const group_mask& lhs, const group_mask& rhs); |
| 306 | +group_mask operator ^(const group_mask& lhs, const group_mask& rhs); |
| 307 | +
|
| 308 | +} // namespace oneapi |
| 309 | +} // namespace ext |
| 310 | +} // namespace sycl |
| 311 | +---- |
| 312 | + |
| 313 | +== Issues |
| 314 | + |
| 315 | +None. |
| 316 | + |
| 317 | +//. asd |
| 318 | +//+ |
| 319 | +//-- |
| 320 | +//*RESOLUTION*: Not resolved. |
| 321 | +//-- |
| 322 | +
|
| 323 | +== Revision History |
| 324 | +
|
| 325 | +[cols="5,15,15,70"] |
| 326 | +[grid="rows"] |
| 327 | +[options="header"] |
| 328 | +|======================================== |
| 329 | +|Rev|Date|Author|Changes |
| 330 | +|1|2021-08-11|John Pennycook|*Initial public working draft* |
| 331 | +|======================================== |
| 332 | +
|
| 333 | +//************************************************************************ |
| 334 | +//Other formatting suggestions: |
| 335 | +// |
| 336 | +//* Use *bold* text for host APIs, or [source] syntax highlighting. |
| 337 | +//* Use +mono+ text for device APIs, or [source] syntax highlighting. |
| 338 | +//* Use +mono+ text for extension names, types, or enum values. |
| 339 | +//* Use _italics_ for parameters. |
| 340 | +//************************************************************************ |
0 commit comments