Skip to content

Commit 9f8cc3a

Browse files
authored
[SYCL][Doc] Add SYCL_INTEL_bf16_conversion specification (#4237)
Signed-off-by: Alexey Sotkin <[email protected]>
1 parent 1e3136e commit 9f8cc3a

File tree

1 file changed

+332
-0
lines changed

1 file changed

+332
-0
lines changed
Lines changed: 332 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,332 @@
1+
= SYCL_INTEL_bf16_conversion
2+
3+
:source-highlighter: coderay
4+
:coderay-linenums-mode: table
5+
6+
// This section needs to be after the document title.
7+
:doctype: book
8+
:toc2:
9+
:toc: left
10+
:encoding: utf-8
11+
:lang: en
12+
13+
:blank: pass:[ +]
14+
15+
// Set the default source code type in this document to C++,
16+
// for syntax highlighting purposes. This is needed because
17+
// docbook uses c++ and html5 uses cpp.
18+
:language: {basebackend@docbook:c++:cpp}
19+
20+
// This is necessary for asciidoc, but not for asciidoctor
21+
:cpp: C++
22+
23+
== Notice
24+
25+
IMPORTANT: This specification is a draft.
26+
27+
Copyright (c) 2021 Intel Corporation. All rights reserved.
28+
29+
NOTE: Khronos(R) is a registered trademark and SYCL(TM) and SPIR(TM) are
30+
trademarks of The Khronos Group Inc. OpenCL(TM) is a trademark of Apple Inc.
31+
used by permission by Khronos.
32+
33+
== Dependencies
34+
35+
This extension is written against the SYCL 2020 specification, Revision 3.
36+
37+
== Status
38+
39+
Draft
40+
41+
This is a preview extension specification, intended to provide early access to
42+
a feature for review and community feedback. When the feature matures, this
43+
specification may be released as a formal extension.
44+
45+
Because the interfaces defined by this specification are not final and are
46+
subject to change they are not intended to be used by shipping software
47+
products.
48+
49+
== Version
50+
51+
Revision: 3
52+
53+
== Introduction
54+
55+
This extension adds functionality to convert value of single-precision
56+
floating-point type(`float`) to `bfloat16` type and vice versa. The extension
57+
doesn't add support for `bfloat16` type as such, instead it uses 16-bit integer
58+
type(`uint16_t`) as a storage for `bfloat16` values.
59+
60+
The purpose of conversion from float to bfloat16 is to reduce ammount of memory
61+
required to store floating-point numbers. Computations are expected to be done with
62+
32-bit floating-point values.
63+
64+
This extension is an optional kernel feature as described in
65+
https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:optional-kernel-features[section 5.7]
66+
of the SYCL 2020 spec. Therefore, attempting to submit a kernel using this
67+
feature to a device that does not support it should cause a synchronous
68+
`errc::kernel_not_supported` exception to be thrown from the kernel invocation
69+
command (e.g. from `parallel_for`).
70+
71+
== Feature test macro
72+
73+
This extension provides a feature-test macro as described in the core SYCL
74+
specification section 6.3.3 "Feature test macros". Therefore, an implementation
75+
supporting this extension must predefine the macro
76+
`SYCL_EXT_INTEL_BF16_CONVERSION` to one of the values defined in the table
77+
below. Applications can test for the existence of this macro to determine if
78+
the implementation supports this feature, or applications can test the macro’s
79+
value to determine which of the extension’s APIs the implementation supports.
80+
81+
[%header,cols="1,5"]
82+
|===
83+
|Value |Description
84+
|1 |Initial extension version. Base features are supported.
85+
|===
86+
87+
== Extension to `enum class aspect`
88+
89+
[source]
90+
----
91+
namespace sycl {
92+
enum class aspect {
93+
...
94+
ext_intel_bf16_conversion
95+
}
96+
}
97+
----
98+
99+
If a SYCL device has the `ext_intel_bf16_conversion` aspect, then it natively
100+
supports conversion of values of `float` type to `bfloat16` and back.
101+
102+
If the device doesn't have the aspect, objects of `bfloat16` class must not be
103+
used in the device code.
104+
105+
== New `bfloat16` class
106+
107+
The `bfloat16` class below provides the conversion functionality. Conversion
108+
from `float` to `bfloat16` is done with round to nearest even(RTE) rounding
109+
mode.
110+
111+
[source]
112+
----
113+
namespace sycl {
114+
namespace ext {
115+
namespace intel {
116+
namespace experimental {
117+
118+
class bfloat16 {
119+
using storage_t = uint16_t;
120+
storage_t value;
121+
122+
public:
123+
bfloat16() = default;
124+
bfloat16(const bfloat16 &) = default;
125+
~bfloat16() = default;
126+
127+
// Explicit conversion functions
128+
static storage_t from_float(const float &a);
129+
static float to_float(const storage_t &a);
130+
131+
// Convert from float to bfloat16
132+
bfloat16(const float &a);
133+
bfloat16 &operator=(const float &a);
134+
135+
// Convert from bfloat16 to float
136+
operator float() const;
137+
138+
// Get bfloat16 as uint16.
139+
operator storage_t() const;
140+
141+
// Convert to bool type
142+
explicit operator bool();
143+
144+
friend bfloat16 operator-(bfloat16 &bf) { /* ... */ }
145+
146+
// OP is: prefix ++, --
147+
friend bfloat16 &operatorOP(bfloat16 &bf) { /* ... */ }
148+
149+
// OP is: postfix ++, --
150+
friend bfloat16 operatorOP(bfloat16 &bf, int) { /* ... */ }
151+
152+
// OP is: +=, -=, *=, /=
153+
friend bfloat16 &operatorOP(bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }
154+
155+
// OP is +, -, *, /
156+
friend bfloat16 operatorOP(const bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }
157+
template <typename T>
158+
friend bfloat16 operatorOP(const bfloat16 &lhs, const T &rhs) { /* ... */ }
159+
template <typename T>
160+
friend bfloat16 operatorOP(const T &lhs, const bfloat16 &rhs) { /* ... */ }
161+
162+
// OP is ==,!=, <, >, <=, >=
163+
friend bool operatorOP(const bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }
164+
template <typename T>
165+
friend bool operatorOP(const bfloat16 &lhs, const T &rhs) { /* ... */ }
166+
template <typename T>
167+
friend bool operatorOP(const T &lhs, const bfloat16 &rhs) { /* ... */ }
168+
};
169+
170+
} // namespace experimental
171+
} // namespace intel
172+
} // namespace ext
173+
} // namespace sycl
174+
----
175+
176+
Table 1. Member functions of `bfloat16` class.
177+
|===
178+
| Member Function | Description
179+
180+
| `static storage_t from_float(const float &a);`
181+
| Explicitly convert from `float` to `bfloat16`.
182+
183+
| `static float to_float(const storage_t &a);`
184+
| Interpret `a` as `bfloat16` and explicitly convert it to `float`.
185+
186+
| `bfloat16(const float& a);`
187+
| Construct `bfloat16` from `float`. Converts `float` to `bfloat16`.
188+
189+
| `bfloat16 &operator=(const float &a);`
190+
| Replace the value with `a` converted to `bfloat16`
191+
192+
| `operator float() const;`
193+
| Return `bfloat16` value converted to `float`.
194+
195+
| `operator storage_t() const;`
196+
| Return `uint16_t` value, whose bits represent `bfloat16` value.
197+
198+
| `explicit operator bool() { /* ... */ }`
199+
| Convert `bfloat16` to `bool` type. Return `false` if the value equals to
200+
zero, return `true` otherwise.
201+
202+
| `friend bfloat16 operator-(bfloat16 &bf) { /* ... */ }`
203+
| Construct new instance of `bfloat16` class with negated value of the `bf`.
204+
205+
| `friend bfloat16 &operatorOP(bfloat16 &bf) { /* ... */ }`
206+
| Perform an in-place `OP` prefix arithmetic operation on the `bf`,
207+
assigning the result to the `bf` and return the `bf`.
208+
209+
OP is: `++, --`
210+
211+
| `friend bfloat16 operatorOP(bfloat16 &bf, int) { /* ... */ }`
212+
| Perform an in-place `OP` postfix arithmetic operation on `bf`, assigning
213+
the result to the `bf` and return a copy of `bf` before the operation is
214+
performed.
215+
216+
OP is: `++, --`
217+
218+
| `friend bfloat16 operatorOP(const bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }`
219+
| Perform an in-place `OP` arithmetic operation between the `lhs` and the `rhs`
220+
and return the `lhs`.
221+
222+
OP is: `+=, -=, *=, /=`
223+
224+
| `friend type operatorOP(const bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }`
225+
| Construct a new instance of the `bfloat16` class with the value of the new
226+
`bfloat16` instance being the result of an OP arithmetic operation between
227+
the `lhs` `bfloat16` and `rhs` `bfloat16` values.
228+
229+
OP is `+, -, *, /`
230+
231+
| `template <typename T>
232+
friend bfloat16 operatorOP(const bfloat16 &lhs, const T &rhs) { /* ... */ }`
233+
| Construct a new instance of the `bfloat16` class with the value of the new
234+
`bfloat16` instance being the result of an OP arithmetic operation between
235+
the `lhs` `bfloat16` value and `rhs` of template type `T`. Type `T` must be
236+
convertible to `float`.
237+
238+
OP is `+, -, *, /`
239+
240+
| `template <typename T>
241+
friend bfloat16 operatorOP(const T &lhs, const bfloat16 &rhs) { /* ... */ }`
242+
| Construct a new instance of the `bfloat16` class with the value of the new
243+
`bfloat16` instance being the result of an OP arithmetic operation between
244+
the `lhs` of template type `T` and `rhs` `bfloat16` value. Type `T` must be
245+
convertible to `float`.
246+
247+
OP is `+, -, *, /`
248+
249+
| `friend bool operatorOP(const bfloat16 &lhs, const bfloat16 &rhs) { /* ... */ }`
250+
| Perform comparison operation OP between `lhs` `bfloat16` and `rhs` `bfloat16`
251+
values and return the result as a boolean value.
252+
253+
OP is `==, !=, <, >, <=, >=`
254+
255+
| `template <typename T>
256+
friend bool operatorOP(const bfloat16 &lhs, const T &rhs) { /* ... */ }`
257+
| Perform comparison operation OP between `lhs` `bfloat16` and `rhs` of
258+
template type `T` and return the result as a boolean value. Type `T` must be
259+
convertible to `float`.
260+
261+
OP is `==, !=, <, >, <=, >=`
262+
263+
| `template <typename T>
264+
friend bool operatorOP(const T &lhs, const bfloat16 &rhs) { /* ... */ }`
265+
| Perform comparison operation OP between `lhs` of template type `T` and `rhs`
266+
`bfloat16` value and return the result as a boolean value. Type `T` must be
267+
convertible to `float`.
268+
269+
OP is `==, !=, <, >, <=, >=`
270+
|===
271+
272+
== Example
273+
274+
[source]
275+
----
276+
#include <sycl/sycl.hpp>
277+
#include <sycl/ext/intel/experimental/bfloat16.hpp>
278+
279+
using sycl::ext::intel::experimental::bfloat16;
280+
281+
bfloat16 operator+(const bfloat16 &lhs, const bfloat16 &rhs) {
282+
return static_cast<float>(lhs) + static_cast<float>(rhs);
283+
}
284+
285+
float foo(float a, float b) {
286+
// Convert from float to bfloat16.
287+
bfloat16 A {a};
288+
bfloat16 B {b};
289+
290+
// Convert A and B from bfloat16 to float, do addition on floating-pointer
291+
// numbers, then convert the result to bfloat16 and store it in C.
292+
bfloat16 C = A + B;
293+
294+
// Return the result converted from bfloat16 to float.
295+
return C;
296+
}
297+
298+
int main (int argc, char *argv[]) {
299+
float data[3] = {7.0, 8.1, 0.0};
300+
sycl::device dev;
301+
sycl::queue deviceQueue{dev};
302+
sycl::buffer<float, 1> buf {data, sycl::range<1> {3}};
303+
304+
if (dev.has(sycl::aspect::ext_intel_bf16_conversion)) {
305+
deviceQueue.submit ([&] (sycl::handler& cgh) {
306+
auto numbers = buf.get_access<sycl::access::mode::read_write> (cgh);
307+
cgh.single_task<class simple_kernel> ([=] () {
308+
numbers[2] = foo(numbers[0], numbers[1]);
309+
});
310+
});
311+
}
312+
return 0;
313+
}
314+
----
315+
316+
== Issues
317+
318+
None.
319+
320+
== Revision History
321+
322+
[cols="5,15,15,70"]
323+
[grid="rows"]
324+
[options="header"]
325+
|========================================
326+
|Rev|Date|Author|Changes
327+
|1|2021-08-02|Alexey Sotkin |Initial public working draft
328+
|2|2021-08-17|Alexey Sotkin |Add explicit conversion functions +
329+
Add operator overloadings +
330+
Apply code review suggestions
331+
|3|2021-08-18|Alexey Sotkin |Remove `uint16_t` constructor
332+
|========================================

0 commit comments

Comments
 (0)