Commit 56c9bcb

[SYCL][Doc] Update sub-group extension docs
Splits sub-group functionality into three extensions:

- SubGroup (sub_group class and device queries)
- SubGroupAlgorithms (GroupAlgorithm support and permute)
- GroupMask (sub_group::mask_type and ballot)

Signed-off-by: John Pennycook <[email protected]>
1 parent 92e01dc commit 56c9bcb

7 files changed, +682 -286 lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# SYCL_INTEL_group_mask

A new `group_mask` class providing an ability to efficiently represent subsets of work-items in a group for which a given Boolean condition holds.
Lines changed: 244 additions & 0 deletions
@@ -0,0 +1,244 @@
= SYCL_INTEL_group_mask
:source-highlighter: coderay
:coderay-linenums-mode: table

// This section needs to be after the document title.
:doctype: book
:toc2:
:toc: left
:encoding: utf-8
:lang: en

:blank: pass:[ +]

// Set the default source code type in this document to C++,
// for syntax highlighting purposes. This is needed because
// docbook uses c++ and html5 uses cpp.
:language: {basebackend@docbook:c++:cpp}

== Introduction
IMPORTANT: This specification is a draft.

NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc. used by permission by Khronos.

NOTE: This document is better viewed when rendered as HTML with asciidoctor. GitHub does not render image icons.

This document describes an extension which adds a +group_mask+ type. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds. Group mask functionality is currently limited to groups that are instances of the +sub_group+ class.

== Name Strings

+SYCL_INTEL_group_mask+

== Notice

Copyright (c) 2020 Intel Corporation. All rights reserved.

== Status

Working Draft

This is a preview extension specification, intended to provide early access to a feature for review and community feedback. When the feature matures, this specification may be released as a formal extension.

Because the interfaces defined by this specification are not final and are subject to change, they are not intended to be used by shipping software products.

== Version

Built On: {docdate} +
Revision: 1

== Contact
John Pennycook, Intel (john 'dot' pennycook 'at' intel 'dot' com)

== Dependencies

This extension is written against the SYCL 1.2.1 specification, Revision v1.2.1-6 and the following extensions:

- +SYCL_INTEL_sub_group+

== Overview

A group mask is an integral type sized such that each work-item in the group is represented by a single bit. Such a mask can be used to efficiently represent subsets of work-items in a group for which a given Boolean condition holds.

Group mask functionality is currently limited to groups that are instances of the +sub_group+ class, but this limitation may be lifted in a future version of the specification.

=== Ballot

The +ballot+ algorithm converts a Boolean condition from each work-item in the group into a group mask. Like other group algorithms, +ballot+ must be encountered by all work-items in the group in converged control flow.

|===
|Member Functions|Description

|+template <typename Group> Group::mask_type ballot(bool predicate = true) const+
|Return a +group_mask+ representing the set of work-items in the group for which _predicate_ is +true+.
|===
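
The effect of +ballot+ can be illustrated with a host-only emulation. The sketch below is not part of the extension and does not use SYCL: `emulate_ballot` is a hypothetical helper, each element of `predicates` stands in for the value one work-item would pass to +ballot+, and a plain `uint32_t` stands in for the opaque mask type (so the group is assumed to hold at most 32 work-items). Bit _i_ of the result represents the work-item with id _i_, matching the bit ordering this specification defines for group masks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side emulation only; not an API defined by this extension.
// predicates[id] is the Boolean that work-item `id` would contribute.
// Assumption: at most 32 work-items, so the mask fits in a uint32_t.
inline std::uint32_t emulate_ballot(const std::vector<bool>& predicates) {
  assert(predicates.size() <= 32);
  std::uint32_t mask = 0;
  for (std::size_t id = 0; id < predicates.size(); ++id) {
    if (predicates[id]) {
      // The least significant bit corresponds to work-item 0.
      mask |= (std::uint32_t{1} << id);
    }
  }
  return mask;
}
```

For example, predicates of true, false, true, true for work-items 0 to 3 give a mask whose bits 0, 2 and 3 are set.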

=== Group Masks

The group mask type is an opaque type, permitting implementations to use any mask representation subject to the following restrictions:

- The size and alignment of the mask type must be the same on the host and device
- A SYCL implementation supporting OpenCL interoperability must use a 128-bit mask convertible to a +vec<uint,4>+

Functions declared in the +group_mask+ class can be called independently by different work-items in the same group. An instance of a group class (e.g. +group+ or +sub_group+) is not required to manipulate a group mask.

The mask is defined such that the least significant bit (LSB) corresponds to the work-item with id 0, and the most significant bit (MSB) corresponds to the work-item with the id +max_local_range()-1+.

|===
|Member Function|Description

|+bool operator[](id<1> id) const+
|Return +true+ if the bit corresponding to the specified _id_ is set in the mask.

|+group_mask::reference operator[](id<1> id)+
|Return a reference to the bit corresponding to the specified _id_ in the mask.

|+bool test(id<1> id) const+
|Return +true+ if the bit corresponding to the specified _id_ is set in the mask.

|+bool all() const+
|Return +true+ if all bits in the mask are set.

|+bool any() const+
|Return +true+ if any bits in the mask are set.

|+bool none() const+
|Return +true+ if none of the bits in the mask are set.

|+uint32_t count() const+
|Return the number of bits set in the mask.

|+uint32_t size() const+
|Return the number of bits in the mask.

|+id<1> find_low() const+
|Return the lowest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`.

|+id<1> find_high() const+
|Return the highest +id+ with a corresponding bit set in the mask. If no bits are set, the return value is equal to `size()`.

|+template <typename T = vec<uint32_t,4>> void insert_bits(T bits, id<1> pos = 0)+
|Insert `CHAR_BIT * sizeof(T)` bits into the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits are ignored.

|+template <typename T = vec<uint32_t,4>> T extract_bits(id<1> pos = 0) const+
|Return `CHAR_BIT * sizeof(T)` bits from the mask, starting from _pos_. `T` must be an integral type or a SYCL vector of integral types. _pos_ must be a multiple of `CHAR_BIT * sizeof(T)` in the range [0, `size()`). If _pos_ + `CHAR_BIT * sizeof(T)` is greater than `size()`, the final (_pos_ + `CHAR_BIT * sizeof(T)`) - `size()` bits of the return value are zero.

|+void set()+
|Set all bits in the mask to true.

|+void set(id<1> id, bool value = true)+
|Set the bit corresponding to the specified _id_ to the value specified by _value_.

|+void reset()+
|Reset all bits in the mask.

|+void reset(id<1> id)+
|Reset the bit corresponding to the specified _id_.

|+void reset_low()+
|Reset the bit for the lowest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_low())+.

|+void reset_high()+
|Reset the bit for the highest +id+ with a corresponding bit set in the mask. Functionally equivalent to +reset(find_high())+.

|+void flip()+
|Toggle the values of all bits in the mask.

|+void flip(id<1> id)+
|Toggle the value of the bit corresponding to the specified _id_.

|===
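
The search functions in the table above can be sketched on the host with `std::bitset` standing in for the opaque mask. Everything below is illustrative rather than part of the extension: `query_mask`, `kQueryMaskBits` and the free functions are hypothetical names, and the 32-bit size is an arbitrary assumption. The sketch follows the stated convention that `find_low` and `find_high` return `size()` when no bit is set. The guard in `reset_low` for an empty mask is an assumption of this sketch, since the table does not say what `reset(find_low())` does in that case.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the opaque mask type; 32 bits is an assumption.
constexpr std::size_t kQueryMaskBits = 32;
using query_mask = std::bitset<kQueryMaskBits>;

// Lowest set bit position, or size() if the mask is empty.
inline std::size_t find_low(const query_mask& m) {
  for (std::size_t i = 0; i < m.size(); ++i) {
    if (m.test(i)) return i;
  }
  return m.size();
}

// Highest set bit position, or size() if the mask is empty.
inline std::size_t find_high(const query_mask& m) {
  for (std::size_t i = m.size(); i-- > 0;) {
    if (m.test(i)) return i;
  }
  return m.size();
}

// Clears the lowest set bit. Leaving an empty mask unchanged is a choice
// made by this sketch, not behavior stated by the specification.
inline void reset_low(query_mask& m) {
  const std::size_t low = find_low(m);
  if (low < m.size()) m.reset(low);
}
```

Repeatedly calling `find_low` followed by `reset_low` until `none()` holds visits each set bit once, in increasing id order.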

==== Sample Header

[source, c++]
----
namespace cl {
namespace sycl {
namespace intel {

struct group_mask {

  // Proxy type giving read/write access to an individual bit
  struct reference {
    reference& operator=(bool x);
    reference& operator=(const reference& x);
    bool operator~() const;
    operator bool() const;
    reference& flip();
  };

  // Element access and queries
  bool operator[](id<1> id) const;
  reference operator[](id<1> id);
  bool test(id<1> id) const;
  bool all() const;
  bool any() const;
  bool none() const;
  uint32_t count() const;
  uint32_t size() const;
  id<1> find_low() const;
  id<1> find_high() const;

  // Conversion to and from integral types and SYCL vectors
  template <typename T = vec<uint32_t,4>>
  void insert_bits(T bits, id<1> pos = 0);

  template <typename T = vec<uint32_t,4>>
  T extract_bits(id<1> pos = 0) const;

  // Modifiers
  void set();
  void set(id<1> id, bool value = true);
  void reset();
  void reset(id<1> id);
  void reset_low();
  void reset_high();
  void flip();
  void flip(id<1> id);

  // Comparison
  bool operator==(group_mask rhs) const;
  bool operator!=(group_mask rhs) const;

  // Bitwise compound assignment, complement and shifts
  group_mask operator &=(group_mask rhs);
  group_mask operator |=(group_mask rhs);
  group_mask operator ^=(group_mask rhs);
  group_mask operator ~() const;
  group_mask operator <<=(group_mask rhs);
  group_mask operator >>=(group_mask rhs);

  // Bitwise binary operators
  group_mask operator &(group_mask rhs) const;
  group_mask operator |(group_mask rhs) const;
  group_mask operator ^(group_mask rhs) const;

};

} // intel
} // sycl
} // cl
----
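
The truncation rule for `insert_bits` and `extract_bits` in the header above is easiest to see with a mask whose size is not a multiple of the element width. The following host-only sketch is not the extension's implementation: it uses `std::bitset` as a stand-in, assumes a hypothetical 24-bit mask (`kBitsMaskSize`, `bits_mask`), takes the mask as an explicit argument, and handles unsigned scalar types only, not SYCL vectors.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical 24-bit stand-in for the opaque mask type.
constexpr std::size_t kBitsMaskSize = 24;
using bits_mask = std::bitset<kBitsMaskSize>;

// Copy CHAR_BIT * sizeof(T) bits of `bits` into the mask starting at `pos`.
// Bits that would land at or beyond size() are ignored.
template <typename T>
void insert_bits(bits_mask& m, T bits, std::size_t pos = 0) {
  static_assert(std::is_unsigned<T>::value, "sketch: unsigned scalars only");
  constexpr std::size_t width = CHAR_BIT * sizeof(T);
  assert(pos % width == 0 && pos < m.size());
  for (std::size_t i = 0; i < width && pos + i < m.size(); ++i) {
    m.set(pos + i, ((bits >> i) & 1u) != 0);
  }
}

// Read CHAR_BIT * sizeof(T) bits from the mask starting at `pos`.
// Positions at or beyond size() read as zero.
template <typename T>
T extract_bits(const bits_mask& m, std::size_t pos = 0) {
  static_assert(std::is_unsigned<T>::value, "sketch: unsigned scalars only");
  constexpr std::size_t width = CHAR_BIT * sizeof(T);
  assert(pos % width == 0 && pos < m.size());
  T out = 0;
  for (std::size_t i = 0; i < width && pos + i < m.size(); ++i) {
    if (m.test(pos + i)) out |= static_cast<T>(T{1} << i);
  }
  return out;
}
```

With a 24-bit mask, inserting a 16-bit value at position 16 stores only its low 8 bits, and extracting 16 bits from position 16 returns those 8 bits with the upper 8 bits zero.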

== Issues

None.

//. asd
//+
//--
//*RESOLUTION*: Not resolved.
//--

== Revision History

[cols="5,15,15,70"]
[grid="rows"]
[options="header"]
|========================================
|Rev|Date|Author|Changes
|1|2020-03-16|John Pennycook|*Initial public working draft*
|========================================

//************************************************************************
//Other formatting suggestions:
//
//* Use *bold* text for host APIs, or [source] syntax highlighting.
//* Use +mono+ text for device APIs, or [source] syntax highlighting.
//* Use +mono+ text for extension names, types, or enum values.
//* Use _italics_ for parameters.
//************************************************************************
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
# SYCL_INTEL_sub_group

A new `sub_group` class representing an implementation-defined grouping of work-items in a work-group.
